Blog

  • Parsing Flash with Swiffy

    Flash

    Google has released a tool called Swiffy that converts Flash files to HTML5. This is relevant to web scraping because content embedded in Flash is a pain to extract, as I wrote about earlier.

    I tried some test files and found the results no more useful for extracting text content than the output produced by swf2html (Linux version). Some neat example conversions are available here. Currently Swiffy supports ActionScript 2.0 and works best with Flash 5, which was released back in 2000, so there is still a lot of work to do.

  • Scraping Flash based websites

    Flash Ajax

    Flash is a pain. It is flaky on Linux and cannot be scraped like HTML because it uses a binary format. HTML5 and Apple’s criticism of Flash are good news for me because they encourage developers to use non-Flash solutions.
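
    To make the binary format point concrete, here is a minimal sketch in Python that reads the 8-byte header of an SWF file: a 3-byte signature, a 1-byte version, and a 4-byte little-endian file length. The function name read_swf_header is just an illustrative name, not part of any library, and the sketch assumes the file is either uncompressed (FWS) or zlib-compressed (CWS); LZMA-compressed files (ZWS) are deliberately not handled.

```python
import struct
import zlib


def read_swf_header(data):
    """Return (signature, version, declared_length, body) for raw SWF bytes.

    Hypothetical helper for illustration only. Handles the "FWS"
    (uncompressed) and "CWS" (zlib-compressed) signatures.
    """
    if len(data) < 8:
        raise ValueError("too short to be an SWF file")

    signature = data[:3].decode("ascii", errors="replace")
    version = data[3]  # single byte: the Flash version number
    (declared_length,) = struct.unpack("<I", data[4:8])  # little-endian uint32

    if signature == "FWS":
        body = data[8:]  # stored as-is
    elif signature == "CWS":
        body = zlib.decompress(data[8:])  # everything after the header is zlib data
    else:
        raise ValueError("unsupported signature: %r" % signature)

    return signature, version, declared_length, body
```

    Even after decompression the body is a sequence of binary tagged records rather than markup, so any text inside still has to be pulled out with a decompiler or converter instead of an HTML parser.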

    The reality, though, is that many sites still use Flash to display content that I need to access. Here are some approaches for scraping Flash that I have tried: