sitescraper | WebScraping.com

Automatic web scraping

Sitescraper, Opensource, Big picture | January 04, 2012

I have been interested in automatic approaches to web scraping for a few years now. During university I created the SiteScraper library, which learned from training cases how to scrape webpages automatically. This approach was particularly useful for scraping a website periodically, because the model could adapt on its own when the page structure was updated but the content remained static.
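
To make the idea of training cases concrete, here is a minimal sketch using only Python's standard library. It is not SiteScraper's actual code or API: the names `PathCollector`, `text_paths`, `train`, and `scrape` are hypothetical, and the "model" is simply a list of tag paths learned from one example page and the strings expected from it. It assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PathCollector(HTMLParser):
    """Record (path, text) for every non-empty text node in a document."""

    def __init__(self):
        super().__init__()
        self.stack = []     # open elements, e.g. ["html[0]", "body[0]"]
        self.counts = [{}]  # per-depth counters of child tags seen so far
        self.texts = []     # collected (path, text) pairs

    def handle_starttag(self, tag, attrs):
        siblings = self.counts[-1]
        index = siblings.get(tag, 0)
        siblings[tag] = index + 1
        if tag in VOID_TAGS:
            return
        self.stack.append(f"{tag}[{index}]")
        self.counts.append({})

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open element.
        if self.stack and self.stack[-1].split("[")[0] == tag:
            self.stack.pop()
            self.counts.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append(("/".join(self.stack), text))


def text_paths(html):
    """Return the (path, text) pairs found in an HTML string."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.texts


def train(html, expected):
    """Learn the path of the first text node matching each expected string."""
    first_path = {}
    for path, text in text_paths(html):
        first_path.setdefault(text, path)
    # Raises KeyError if an expected string is not on the page.
    return [first_path[value] for value in expected]


def scrape(html, model):
    """Apply learned paths to a page; None marks a path that no longer exists."""
    found = {}
    for path, text in text_paths(html):
        found.setdefault(path, text)
    return [found.get(path) for path in model]


# Hypothetical pages, made up for illustration.
page_v1 = "<html><body><h1>Blue Widget</h1><p>$10</p></body></html>"
page_v2 = "<html><body><h1>Red Widget</h1><p>$12</p></body></html>"
page_v3 = ("<html><body><div><span>$10</span><h2>Blue Widget</h2></div>"
           "</body></html>")

model = train(page_v1, ["Blue Widget", "$10"])
print(scrape(page_v2, model))  # same layout, new content

# Layout changed but content did not: the old paths fail, so retrain
# on the new page using the strings already known from the old one.
if None in scrape(page_v3, model):
    model = train(page_v3, ["Blue Widget", "$10"])
print(scrape(page_v3, model))
```

The retraining step at the end is the part that matters for periodic scraping in this sketch: as long as the known content is still on the page, the paths can be relearned without writing new rules by hand.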

The sitescraper module

Sitescraper, Opensource | January 29, 2010

As a student I was fortunate to have the opportunity to learn about web scraping, guided by Professor Timothy Baldwin. Frustrated by a previous project, I set out to build a tool that would make scraping web pages easier.
