Geocoding 5000 addresses in a PHP script
I'm looking to geocode over 5000 addresses at once in a PHP script (this will only ever be run once).
I have been looking into Google as a potential resource for doing just this; however, I've read reports that after running 200 or so queries through them, Google will kick you off for the day.
I was just wondering if there is any other way to geocode 5000 or so addresses, such as another service like the one Google offers, or something similar I could use?
Or will I just have to stagger this? The problem is I don't really have much time and to do 200 or 300 a day for 5000 results will take almost 5 (working) weeks.
Thanks
Tom
You could use Bing Maps instead: the Spatial Data API is made for batch geocoding thousands of addresses at once (that link is even a detailed tutorial on how to use it with PHP).
You just need to register a key at http://www.bingmapsportal.com but that's free and fast (you get the confirmation email within minutes).
Is there a limit to the number of geocode requests I can submit?
If more than 2,500 geocode requests in a 24 hour period are received from a single IP address, or geocode requests are submitted from a single IP address at too fast a rate, the Google Maps API geocoder will begin responding with a status code of 620.
[...]
If you need to submit a very large set of addresses to the Geocoding Web Service to cache for later use, you should consider Google Maps API Premier, which provides a separate batch geocoding quota for this purpose.
-- http://code.google.com/apis/maps/faq.html#geocoder_limit
As @Pekka mentioned: note that Google's terms of service forbid geocoding stuff for purposes other than showing it on a map.
The most reliable solution is to download a geolocation database to your host so that you can run unlimited queries.
http://www.google.de/search?q=geolocation+database&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:ru:official&client=firefox#hl=en&expIds=17259,17315,18168,23628,25646,25834,26637,26746,26761,26849&xhr=t&q=geolocation+database+download&cp=22&pf=p&sclient=psy&client=firefox&hs=MvK&rls=org.mozilla:ru%3Aofficial&source=hp&aq=f&aqi=&aql=&oq=geolocation+database+d&gs_rfai=&pbx=1&fp=d950b79c3319a56e
As @Bart Kiers says, there's a limit on the number of requests you can make in a 24-hour period; there's also a "not too fast" rate limit (per hour?). I'd suggest dividing the seconds in a day (86400) by the daily limit (2500) to get a query rate that shouldn't trip the "too fast" limit. It comes out to about one query every 35 seconds, which should get you all 5000 results in about two days.
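To make the arithmetic concrete, here is a small sketch of that calculation. The function names are made up for illustration; the 2500-per-day figure is the one quoted from the FAQ above.

```php
<?php
// Seconds to wait between requests so that a daily quota is spread
// evenly over 24 hours (86400 seconds).
function secondsBetweenRequests(int $dailyLimit): float
{
    return 86400 / $dailyLimit;
}

// Days needed to get through $total addresses at that pace.
function daysNeeded(int $total, int $dailyLimit): float
{
    return $total * secondsBetweenRequests($dailyLimit) / 86400;
}

printf("%.2f s between requests\n", secondsBetweenRequests(2500));
printf("%.1f days for 5000 addresses\n", daysNeeded(5000, 2500));
```

Rounding 34.56 up to 35 seconds keeps you slightly under the daily quota.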
However, do check the return codes: if the service starts returning 620, stop and give it a rest for some time, else you risk a ban.
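A rough sketch of such a loop might look like the following. It is only an outline under assumptions: $geocode is a hypothetical callable you would write yourself (it is not part of any Google or PHP API) that takes an address and returns an array with a 'status' key, and geocodeAll() is an invented name. The pause is injectable so the loop can be tried out without actually sleeping.

```php
<?php
/**
 * Geocode addresses one at a time, pausing between requests and
 * stopping as soon as the service answers with status 620.
 *
 * $geocode (hypothetical): fn(string $address): array with a 'status' key.
 * $pause (optional): fn(float $seconds): void, defaults to a real sleep.
 * Returns the responses collected so far, keyed by address.
 */
function geocodeAll(array $addresses, callable $geocode, float $delay = 35.0, ?callable $pause = null): array
{
    $pause = $pause ?? function (float $s): void {
        // sleep() for whole seconds, usleep() for the fractional remainder
        sleep((int) floor($s));
        usleep((int) round(($s - floor($s)) * 1000000));
    };

    $results = [];
    $first = true;
    foreach ($addresses as $address) {
        if (!$first) {
            $pause($delay); // wait before every request except the first
        }
        $first = false;

        $response = $geocode($address);
        if ($response['status'] === 620) {
            // Over quota or too fast: stop here and resume later.
            break;
        }
        $results[$address] = $response;
    }
    return $results;
}
```

Any address missing from the returned array was not geocoded, so you can store the results and feed the remaining addresses back in after giving the service a rest. The 35-second default matches the pacing worked out above; pass a different $delay if you want to try another rate.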
What you're trying to do does indeed go against Google's terms of service.
That said, Google will start returning 'over-quota' responses if you don't pause at least 250 ms between geocoding requests.
In practice, if you make only 2 requests a second, you won't get throttled until you hit the daily limit of 2,500 requests.