Blog

  • How to intercept secure network traffic from Android

    Android Mobile apps

    Often I need to reverse engineer how an Android app loads its data. If I can determine the relevant server endpoints then I can call them directly in future via a script and bypass the app. This process is more complex when the app uses HTTPS for network communication, which is becoming more common now that Let’s Encrypt provides SSL certificates for free.
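    As a rough illustration of calling an endpoint directly once it has been identified, here is a minimal sketch using only the Python standard library. The URL, the `page` parameter, and the User-Agent string are hypothetical placeholders, not taken from any real app; the real values would come from the intercepted traffic.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint, for illustration only; substitute the URL
# observed in the intercepted traffic.
API_URL = "https://api.example.com/v1/items"


def build_request(page, user_agent="ExampleApp/1.0 (Android)"):
    """Build a GET request that mimics the call the app was seen to make."""
    query = urllib.parse.urlencode({"page": page})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        # Assumed headers; copy whatever the app actually sends.
        headers={"User-Agent": user_agent, "Accept": "application/json"},
    )


def fetch_items(page):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(page), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

    Keeping the request construction separate from the network call makes it easy to compare the generated URL and headers against what the app sends before hitting the server.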

  • How to automate Android apps with Python

    Android Mobile apps Python

    In a previous post I covered a way to monitor network activity in order to scrape the data from an Android application. Sometimes this approach will not work, for example if the data of interest is embedded within the app or the network traffic is encrypted. For these cases I use UIautomator, which is a Python wrapper for the Android testing framework.

  • Web Scraping with Python book published

    Book

    For the last 9 months or so I have been working intermittently on a book covering the web scraping skills I have picked up over the years in my work. It is now available on Amazon or directly from the publisher, or it can be found on BitTorrent.

  • Heading to Oxford

    Oxford

    This year I will be fulfilling a lifelong dream of studying at Oxford University. For the next 11 months I will be completing an MSc in Computer Science. During this time I will continue to work on projects for existing clients but will probably need to limit taking on new ones.

  • How to scrape Android apps

    Android Mobile apps Google

    Google recently released the Arc Welder extension for Chrome, which allows an Android app to be run on the desktop. The aim of Arc Welder is to make testing Android apps easier, but conveniently it also makes scraping them easier.

  • Luminati

    Business Proxies

    These days I am often contacted by businesses asking if I want to try a free trial of their service. A recent one was Luminati, which claimed to have access to millions of IP addresses. They weren’t willing to divulge much over email and their website had less information than it does now, so we set up a Skype call. My contact was a salesman so he wasn’t able to answer technical questions, but he gave me a good overview of what they are trying to do. Apparently they are an Israeli startup that built a peer-to-peer network called Hola, where users install a plugin to access content that is blocked in their region by downloading via other peers in the network. Now that they had millions of users they wanted to monetize this network by reselling it as a proxy service. Great idea, though this was not made clear when I signed up for a test account with Hola, so I doubt most users are aware their bandwidth is being resold.

  • Phone calls

    Business

    I have noticed that if a new client cannot describe what they are after in an email and instead wants a quick phone call to clarify, the conversation will rarely develop into an actual project. I guess they want to pick my brain and then implement it internally or hire someone cheaper. To limit the amount of time wasted, I have started insisting that an overview of the project be sent before setting up a phone call.

  • Bitcoin

    Business Website

    A few years ago I opened a US bank account so that US clients who wanted to pay by bank transfer could avoid making an international transaction. This worked well until last month, when clients started reporting that their transfers were being rejected. I rang the bank (Chase) and after being transferred between a few departments was told I needed to come into a US branch with my passport to discuss the problem, which couldn’t be handled over the phone. This was quite inconvenient because I don’t live in the US and didn’t plan to visit in the near future.

  • Android database update

    Database Website Mobile apps

    A significant update to the Android Apps database is ready, which now contains over 2 million apps (2,130,732 to be exact). If you have purchased this database previously you can log in to your account to download the updated version for free.

  • Web Scraping or Web Scrapping?

    Business Website

    I searched my email and found that over the last few years I have received 76 messages from clients containing the text “Web Scrapping” rather than the usual spelling “Web Scraping”. And this is not unique to my clients - currently Google has 122,000 results for “Web Scrapping” compared to 447,000 results for “Web Scraping” - so the correct spelling returns fewer than 4x the number of results. In light of this common spelling mistake I registered the domain webscrapping.com and redirected it here.
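    For anyone who wants to check the arithmetic, a minimal sketch using the two result counts quoted above (the function names are my own, for illustration):

```python
def result_ratio(correct, misspelled):
    """How many times more results the correct spelling returns."""
    return correct / misspelled


def misspelled_share(correct, misspelled):
    """Fraction of all results that use the misspelling."""
    return misspelled / (correct + misspelled)


# Result counts quoted in the post.
scraping_results = 447_000
scrapping_results = 122_000

print(f"ratio: {result_ratio(scraping_results, scrapping_results):.2f}x")
print(f"misspelled share: {misspelled_share(scraping_results, scrapping_results):.1%}")
```

    On these numbers the ratio comes out a little under 3.7x, and the misspelling accounts for roughly a fifth of the combined results.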