Figuring out Host's Top Domain with Javascript
Is there a way to figure out what the top domain for the hostname of the current page is? The problem I have is that the script could be on a .com domain, or on an international domain like .co.uk.
So for jobs.telegraph.co.uk the top domain is telegraph.co.uk, and for jobs.nytimes.com the top domain is nytimes.com.
The problem is that location.hostname and document.domain both give the entire domain.
One route is to have a list of all TLDs (too much to carry around) and parse based on that. Another route was to assume that if there are 2 characters after the last ".", then the domain is international and hence the last two labels are the TLD, but that does not hold true for all international domains.
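For reference, the list-based route can be sketched roughly as below. The function name registrableDomain and the three-entry SAMPLE_SUFFIXES set are made up for illustration; a real deployment would need a full list such as the Public Suffix List, which has thousands of entries and wildcard rules that this sketch does not handle.

```javascript
// Hypothetical, tiny sample of public suffixes. A real list is far larger.
const SAMPLE_SUFFIXES = new Set(['com', 'co.uk', 'uk.com']);

// Returns the longest known suffix plus one more label, or null if
// the hostname is itself a suffix or matches nothing in the list.
function registrableDomain(hostname, suffixes) {
  const labels = hostname.toLowerCase().split('.');
  // i = 0 tests the whole hostname; larger i tests shorter suffixes,
  // so the first hit is the longest matching suffix.
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join('.');
    if (suffixes.has(candidate)) {
      // A hostname that is itself a suffix has no registrable domain.
      return i === 0 ? null : labels.slice(i - 1).join('.');
    }
  }
  return null;
}

console.log(registrableDomain('jobs.telegraph.co.uk', SAMPLE_SUFFIXES)); // telegraph.co.uk
console.log(registrableDomain('jobs.nytimes.com', SAMPLE_SUFFIXES));     // nytimes.com
```

The lookup itself is trivial; the cost is entirely in shipping and updating the list, which is the objection raised above.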
The top domain is the shortest suffix of the hostname on which you can set a cookie. Browsers block cookies set directly on TLDs (and on other public suffixes such as co.uk) by default. Provided that the previous sentence is true, you can exploit it to get the top domain for the current page.
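function get_top_domain() {
  var i, h,
    weird_cookie = '_weird_get_top_level_domain=cookie',
    hostname = document.location.hostname.split('.');
  // Try progressively longer suffixes: "uk", then "co.uk", then "telegraph.co.uk", ...
  for (i = hostname.length - 1; i >= 0; i--) {
    h = hostname.slice(i).join('.');
    document.cookie = weird_cookie + ';domain=' + h + ';';
    // The browser silently drops a cookie set on a public suffix,
    // so the first suffix where the cookie sticks is the top domain.
    if (document.cookie.indexOf(weird_cookie) > -1) {
      // Clean up by expiring the test cookie.
      document.cookie = weird_cookie.split('=')[0] + '=;domain=' + h + ';expires=Thu, 01 Jan 1970 00:00:01 GMT;';
      return h;
    }
  }
  // Falls through (returns undefined) if no suffix accepted the cookie.
}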
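To try the same loop outside a browser, one option is to pull the cookie check out into a callback. This is only a sketch: findTopDomain, canSetCookieOn, and the fakeBrowser stand-in with its blocked set are hypothetical names, and the stand-in merely imitates a browser rejecting a few public suffixes.

```javascript
// Same suffix walk as get_top_domain, but the "did the cookie stick?"
// check is injected so the logic can run without document.cookie.
function findTopDomain(hostname, canSetCookieOn) {
  const labels = hostname.split('.');
  // Shortest suffix first: "uk", "co.uk", "telegraph.co.uk", ...
  for (let i = labels.length - 1; i >= 0; i--) {
    const candidate = labels.slice(i).join('.');
    if (canSetCookieOn(candidate)) {
      return candidate;
    }
  }
  return null; // no suffix accepted a cookie
}

// Hypothetical stand-in for the browser: refuses cookies on these suffixes.
const blocked = new Set(['uk', 'co.uk', 'com']);
const fakeBrowser = (domain) => !blocked.has(domain);

console.log(findTopDomain('jobs.telegraph.co.uk', fakeBrowser)); // telegraph.co.uk
console.log(findTopDomain('jobs.nytimes.com', fakeBrowser));     // nytimes.com
```

In a real page the callback would set the test cookie, check document.cookie, and expire it again, exactly as the function above does.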
I'm not sure this is possible to do completely. It may not be meaningful in a lot of cases either. In your example, jobs.telegraph.co.uk is clearly part of The Telegraph, which lives at telegraph.co.uk, but in other cases you have subdomains which have no relationship to the second-level hostname, as is commonly found with free web hosting providers.
There are even "pseudo-NICs" such as CentralNIC which mess up the system by registering subdomains beneath domains such as uk.com, in which case there is clearly no relationship. See for example avon.uk.com.
Even if you ignore those, there are whole TLDs where the structure is a mess, and .uk is one example. There are valid hostnames at the second level such as nhs.uk and mod.uk, most domains are registered at the third level such as bbc.co.uk, but .sch.uk domains can only be registered at the fourth level (i.e. in the address http://learning.oriel.w-sussex.sch.uk/ you'd be looking for oriel.w-sussex.sch.uk, and w-sussex.sch.uk cannot be a valid hostname).
I'm not sure if this can be done in JavaScript, but one possibility would be to do whois lookups at each level (i.e. jobs.telegraph.co.uk, telegraph.co.uk, .co.uk) until you get an error message along the lines of "registrations not available at this level", then accept the level below as the hostname. Unfortunately I think these messages vary by registrar, but at least there are fewer registrars than possible hostname permutations.
Does this do it for you?
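<script>
var doms = ["telegraph.co.uk", "jobs.nytimes.com"];

// Heuristic: a two-letter second-to-last label (e.g. "co" in co.uk) is
// treated as part of the suffix, so three labels are kept instead of two.
// Despite its name, this returns the registrable domain, not the TLD.
function getTLD(str) {
  var parts = str.split('.');
  if (parts.length < 2) return str; // single label, e.g. "localhost"
  var keep = (parts[parts.length - 2].length == 2) ? 3 : 2;
  // Clamp at 0 so a two-label host such as "co.uk" is returned whole.
  return parts.slice(Math.max(0, parts.length - keep)).join('.');
}

for (var i = 0; i < doms.length; i++) {
  alert(getTLD(doms[i]));
}
</script>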
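A quick check of where that two-letter rule breaks down may be useful. The sketch below restates the heuristic under a made-up name, twoLetterHeuristic, so it can run on its own, and feeds it the .sch.uk hostname mentioned earlier in the thread plus a .com host with a two-letter name.

```javascript
// Standalone restatement of the two-letter heuristic (hypothetical name).
function twoLetterHeuristic(hostname) {
  const parts = hostname.split('.');
  if (parts.length < 2) return hostname;
  // Keep three labels if the second-to-last label has two letters, else two.
  const keep = parts[parts.length - 2].length === 2 ? 3 : 2;
  return parts.slice(-keep).join('.');
}

// Works for the two hosts in the question.
console.log(twoLetterHeuristic('jobs.telegraph.co.uk')); // telegraph.co.uk
console.log(twoLetterHeuristic('jobs.nytimes.com'));     // nytimes.com

// Keeps too little: "sch" has three letters, so only two labels survive.
console.log(twoLetterHeuristic('learning.oriel.w-sussex.sch.uk')); // sch.uk

// Keeps too much: a two-letter name under .com is mistaken for a suffix.
console.log(twoLetterHeuristic('www.hp.com')); // www.hp.com
```

So the rule is fine for the two examples given, but it is not a general answer.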