I often use Google’s geocoding API to find details about a location like this:
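As a rough sketch of such a lookup, assuming the public Geocoding web service endpoint with its `latlng` and `key` query parameters (the helper names and the placeholder key are hypothetical, and the network call is left commented out):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public endpoint of the Geocoding web service (JSON output).
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(lat, lng, api_key):
    """Build a reverse geocoding request URL (hypothetical helper)."""
    params = {"latlng": f"{lat},{lng}", "key": api_key}
    return GEOCODE_URL + "?" + urlencode(params)


def fetch_location(lat, lng, api_key):
    """Fetch and decode the JSON response; needs network access and a valid key."""
    with urlopen(build_geocode_url(lat, lng, api_key)) as response:
        return json.load(response)


url = build_geocode_url(-37.81, 144.96, "YOUR_API_KEY")
print(url)
# details = fetch_location(-37.81, 144.96, "YOUR_API_KEY")  # counts against the daily quota
```

Every call to `fetch_location` is one request against the daily limit.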
The drawback of this approach is that the Google API limits each user to 2,500 requests per 24 hours. So if I want to geocode 1 million locations, I would need to rent a lot of proxies, or else the API calls would take over a year to complete (1,000,000 / 2,500 = 400 days). To meet this use case I built a module that reverse geocodes a latitude/longitude coordinate using a list of known locations from geonames.
Here is some example usage:
Internally the module uses a k-d tree to efficiently find the nearest neighbour of each given coordinate. On my netbook, building the tree takes ~2.5 seconds and each location query then takes just ~1.5 ms.
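To illustrate the idea, here is a minimal pure-Python sketch of a 2-d tree with a nearest-neighbour search. It is not the module's actual code: the sample locations, the function names, and the use of plain Euclidean distance on latitude/longitude (which ignores the Earth's curvature) are all assumptions made for the example.

```python
# Hypothetical sample data: (latitude, longitude, city). The real module
# loads its known locations from geonames instead.
LOCATIONS = [
    (-37.814, 144.963, "Melbourne"),
    (31.769, 35.216, "Jerusalem"),
    (51.507, -0.128, "London"),
    (40.713, -74.006, "New York"),
    (35.690, 139.692, "Tokyo"),
]


def build_kdtree(points, depth=0):
    """Recursively split the points on alternating axes (0 = lat, 1 = lon)."""
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def squared_distance(a, b):
    """Squared Euclidean distance on the (lat, lon) pair; an approximation."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    point = node["point"]
    if best is None or squared_distance(point, target) < squared_distance(best, target):
        best = point
    diff = target[node["axis"]] - point[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    # Only descend into the far side if the splitting line is closer
    # than the best match found so far.
    if diff ** 2 < squared_distance(best, target):
        best = nearest(far, target, best)
    return best


tree = build_kdtree(LOCATIONS)
print(nearest(tree, (-37.81, 144.96))[2])  # Melbourne
print(nearest(tree, (31.76, 35.21))[2])    # Jerusalem
```

The tree is built once up front, and each query then skips any branch whose splitting line is already farther away than the best match so far, which is why lookups stay fast as the list of locations grows.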
The module is licensed under the LGPL and hosted on Bitbucket: https://bitbucket.org/richardpenman/reverse_geocode