An ongoing problem for my web scraping work is how much to quote for a job. I prefer fixed fee to hourly rates so I need to consider the complexity upfront. My initial strategy was simply to quote low to ensure I got business and hopefully build up some regular clients.
Through experience I found the following factors most effected the time required for a job:
- Website size
- Login protected
- IP restrictions
- HTML quality
- JavaScript/AJAX
I developed a formula based on these factors and have now built an interface that lets potential clients clarify the costs involved with different kinds of web scraping jobs. Additionally I hope this will reduce the communication overhead by helping clients to provide the necessary information upfront.