Python and other scripting languages are sometimes dismissed as too slow compared with compiled languages like C. For example, here are implementations of the Fibonacci sequence in C and Python:

int fib(int n){
   if (n < 2)
     return n;
   else
     return fib(n - 1) + fib(n - 2);
}

int main() {
    fib(40);
    return 0;
}

def fib(n):
    if n < 2:
        return n
    else:
        return fib(n - 1) + fib(n - 2)

fib(40)

And here are the execution times:

$ time ./fib
3.099s
$ time python fib.py
16.655s

As expected, C is much faster: roughly 5x in this case (3.1s versus 16.7s).
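The shell's time command measures the whole process, including interpreter start-up. To time only the function call from inside Python, something like the following minimal sketch could be used. The helper name time_call is hypothetical (it is not part of the original script), and a smaller argument is used so the example finishes quickly.

```python
import timeit


def fib(n):
    if n < 2:
        return n
    else:
        return fib(n - 1) + fib(n - 2)


def time_call(func, *args):
    """Hypothetical helper: call func(*args) and return (result, elapsed seconds)."""
    start = timeit.default_timer()  # wall-clock timer from the standard library
    result = func(*args)
    elapsed = timeit.default_timer() - start
    return result, elapsed


if __name__ == "__main__":
    # n=25 keeps the run short; the timings quoted in this post used n=40
    result, elapsed = time_call(fib, 25)
    print("fib(25) = %d in %.3fs" % (result, elapsed))
```

The measured time will vary by machine and Python version, so no expected timing is shown here.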

In the context of web scraping, raw execution speed matters less because the bottleneck is I/O: downloading the webpages. But I use Python in other contexts too, so let’s see if we can do better.

First install psyco. On Debian-based Linux distributions such as Ubuntu this is just:

sudo apt-get install python-psyco

Then modify the Python script to call psyco:

import psyco
psyco.full()

def fib(n):
    if n < 2:
        return n
    else:
        return fib(n - 1) + fib(n - 2)

fib(40)

And here is the updated execution time:

$ time python fib.py
3.190s

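Just 3 seconds - with psyco the execution time is now roughly equal to the C example! Psyco achieves this by compiling functions to machine code on the fly (just-in-time compilation), so the interpreter no longer has to step through their bytecode one instruction at a time.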
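Part of the reason this benchmark responds so well to compilation is that it consists almost entirely of function calls. A rough sketch that counts them is below. The names fib_counting and count_calls are made up for this illustration, and a one-element list is used as the counter so the code runs on both Python 2 and Python 3.

```python
def fib_counting(n, counter):
    """Same recursion as fib, but increments counter[0] on every call."""
    counter[0] += 1
    if n < 2:
        return n
    else:
        return fib_counting(n - 1, counter) + fib_counting(n - 2, counter)


def count_calls(n):
    """Return (fib(n), number of calls made to compute it)."""
    counter = [0]
    value = fib_counting(n, counter)
    return value, counter[0]


if __name__ == "__main__":
    value, calls = count_calls(20)
    print("fib(20) = %d after %d calls" % (value, calls))
    # prints: fib(20) = 6765 after 21891 calls
```

The call count for this recursion works out to 2 * fib(n + 1) - 1, so fib(40) makes 331,160,281 calls, and any per-call overhead is multiplied accordingly.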

I now add the snippet below to most of my Python scripts to take advantage of psyco when it is installed:

try:
    import psyco
    psyco.full()
except ImportError:
    pass  # psyco not installed, so continue as usual
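If a script needs to know whether psyco was actually enabled (for example, to log it next to a timing), the same pattern could be wrapped in a small function. This is only a sketch: enable_psyco is a hypothetical name, and the only psyco call used is psyco.full(), as above. On interpreters where psyco cannot be imported, it simply returns False.

```python
def enable_psyco():
    """Hypothetical helper: try to enable psyco, return True if it worked."""
    try:
        import psyco
    except ImportError:
        return False  # psyco not installed, so continue as usual
    psyco.full()  # compile as much of the program as psyco can
    return True


def fib(n):
    if n < 2:
        return n
    else:
        return fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    using_psyco = enable_psyco()
    print("psyco enabled: %s" % using_psyco)
    print("fib(20) = %d" % fib(20))
```

Either way the script computes the same result; only the running time differs.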