Asynchronous support in Python

This week Guido van Rossum (the author of Python) put out a call for experts in asynchronous programming to collaborate on a new API.

Exciting news! From my perspective, Python’s poor asynchronous support is its main weakness. Currently, to download webpages in parallel I have to use operating system threads, each of which reserves its own stack and so uses a lot of memory. This limits the number of threads I can start when crawling.
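To make the thread-based approach concrete, here is a minimal sketch of downloading pages in parallel with a pool of system threads. It is written against the Python 3 standard library and is not the code I actually crawl with: the names fetch, download_all and max_threads are made up for illustration, and the cap of 20 threads is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, max_threads=20, fetcher=fetch):
    """Download urls in parallel using at most max_threads system threads.

    Returns a dict mapping each URL to its body, or to the exception
    that was raised while fetching it.
    """
    urls = list(urls)  # allow any iterable, since we walk it twice

    def safe_fetch(url):
        # Catch errors per URL so one failure does not abort the batch.
        try:
            return fetcher(url)
        except Exception as exc:
            return exc

    results = {}
    # Each worker is a real OS thread, so memory use grows with max_threads.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        for url, body in zip(urls, pool.map(safe_fetch, urls)):
            results[url] = body
    return results
```

Calling download_all with a list of URLs blocks until every download has finished or failed. The fetcher parameter is only there so the downloading function can be swapped out, for example in a test. Raising max_threads increases concurrency but also the memory cost described above.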

To address this shortcoming there are external solutions such as Twisted and gevent. However, I found Twisted not flexible enough for my use and gevent unstable.

This led me to evaluate Go and Erlang, whose strength is lightweight threads. I found these languages interesting, but few people in their communities are involved in web scraping, so I would need to build much of the infrastructure myself. For now I will stick with Python.

I really hope this move by Guido goes somewhere. When Python 3 was released in 2008, I expected it to overtake Python 2 in popularity within a year, but here we are in 2012. Good async support in Python 3 would (finally) give me an incentive to switch.
