How can I determine the geographic origin of a web page or URL?
I'm building a web crawler and I'm trying to figure out where a web page is from. I can check the domain (for example, .com.ar is from Argentina), but there are other sites with generic domains (.com, .net) that are Argentinean too. One example is www.taringa.net: it is an Argentinean site, but with a .net domain.
So how can I do it?
Thanks.
Geo-location by IP. Look up the server's IP address in a geolocation database, and you can get a geographical location. Many of these services cost money, and they will only tell you where the server is physically hosted.
Do a whois on the domain. It will tell you where the domain is registered.
But remember, "where is a web page from" has no real meaning. The web has no geographic boundaries. I can run a Spanish-language site out of San Jose, California, and register the domain contacts in Canada. You will have no way of knowing my site is aimed at Chilean users.
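With that caveat in mind, a crawler can still collect the cheapest signal first: the country-code suffix mentioned in the question. Here is a minimal Python sketch of that check. The function name `cctld_hint` and the host `www.example.com.ar` are made up for illustration, and the result is only a hint, since some two-letter suffixes are used by sites with no tie to the country.

```python
from urllib.parse import urlsplit


def cctld_hint(url):
    """Return the upper-cased two-letter country-code suffix of a URL's host, or None.

    Hypothetical helper: it only inspects the last label of the host name,
    so it says nothing about where the site is hosted or who it targets.
    """
    # Accept bare host names such as "www.taringa.net" as well as full URLs.
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    last_label = host.rstrip(".").rsplit(".", 1)[-1]
    # Two-letter ASCII top-level domains are reserved for country codes.
    if len(last_label) == 2 and last_label.isascii() and last_label.isalpha():
        return last_label.upper()
    return None


if __name__ == "__main__":
    for candidate in ("http://www.example.com.ar/page", "www.taringa.net"):
        print(candidate, "->", cctld_hint(candidate))
```

When this returns None, as it does for www.taringa.net, you have to fall back on the IP or whois lookups.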
You can use a whois query on the command line, or make a request to whois.arin.net and then to whois.xxxx.xxx depending on the result. If I resolve www.taringa.net to an IP, I get this:
www.taringa.net. 300 IN A 190.210.132.53
and running whois on that:
whois 190.210.132.53
gives me a ton of output:
inetnum: 190.210.132/24
status: reallocated
owner: WIROOS SRL
ownerid: AR-WISR1-LACNIC
responsible: ALBERTO NAKAYAMA
address: GRAL MIGUEL DE AZCUENAGA, 71, 4 A
address: C1029AAA - BUENOS AIRES -
country: AR
phone: +54 011 30973059 [3059]
This should generally work for any IP.
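If you want to do the same thing from inside the crawler, here is a rough Python sketch of those steps: resolve the host name, send the IP to a whois server over TCP port 43, and pick the `country:` field out of the reply. The function names are my own, it does not follow referrals between registries, and it assumes you pass the server that actually holds the record (whois.lacnic.net for the address above, as far as I know). Field names also vary between registries, so treat the parser as a starting point.

```python
import socket


def whois_query(query, server="whois.arin.net", port=43, timeout=10):
    """Send one whois query over TCP port 43 and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_country(whois_text):
    """Return the first non-empty 'country:' value in a whois reply, or None."""
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "country" and value.strip():
            return value.strip().upper()
    return None


def country_for_host(hostname, server="whois.arin.net"):
    """Resolve a host name and look up the country of its IP (needs network access)."""
    ip_address = socket.gethostbyname(hostname)
    return parse_country(whois_query(ip_address, server))
```

Something like `country_for_host("www.taringa.net", server="whois.lacnic.net")` would be the call for the example above. Keep in mind that registries rate-limit whois queries, so cache the results per IP block if you are crawling a lot of sites.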