Why Python

Sometimes people ask why I use Python instead of something faster like C/C++. For me the speed of a language is a low priority because in my work the overwhelming amount of execution time is spent waiting for data to be downloaded rather than programming instructions to finish. So it makes sense to use whatever language I can write good code fastest in, which is currently Python because of its high level syntax and excellent ecosystem of libraries. ESR wrote an article on why he likes Python that I expect resonates with many.

Additionally Python is an interpreted language so it is easier for me to distribute my solutions to clients than would be for a compiled language like C. Most of my scraping jobs are relatively small so distribution overhead is important.

A few people have suggested I use ruby instead. I have used ruby and like it, but found it lacks the depth of libraries available to Python.

However Python is by no means perfect - for example there are limitations with threading, using unicode is awkward, and distributing on Windows can be difficult. And there are also many redundant or poorly designed builtin libraries. Some of these issues are being addressed in Python 3, some not.

If I was to ever change language I expect it would be to something more equipped for parallel programming like erlang or haskell.

blog comments powered by Disqus