How do I parse the Visitors by Country info from Alexa?

If you search Alexa for any URL, you get detailed traffic information for that site. I would like to parse the Visitors by Country info from Alexa.

For example, for google.com the URL is http://www.alexa.com/siteinfo/google.com

On the Audience tab you can see:

Visitors by Country for Google.com

United States 35.0%

India 8.8%

China 4.1%

Germany 3.4%

United Kingdom 3.2%

Brazil 3.2%

Iran 2.8%

Japan 2.1%

Russia 2.0%

Italy 1.9%

Indonesia 1.7% // etc.

How can I get only this info from alexa.com? I have tried the preg_match function, but it is very difficult in this case.


If you don't want to use the DOM and getElementById, which is the most elegant solution in this case, you can try a regular expression:

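$data = file_get_contents('http://www.alexa.com/siteinfo/google.com');
// Each entry of $result holds: [0] the whole link, [1] the part of the href
// after /topsites/countries/, [2] the link text (the country name).
preg_match_all(
   '/<a href="\/topsites\/countries\/(.*)">(.*)<\/a>/mU',
   $data,
   $result,
   PREG_SET_ORDER
);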
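The pattern above captures only the link target and the country name, not the percentage. A minimal sketch of capturing the percentage too, run against a made-up HTML fragment: the markup in $html, the two-letter country codes, and the assumption that the percentage is the first number followed by % after each link are all hypothetical, not taken from the real page.

```php
<?php
// Hypothetical fragment; the real page's markup may differ.
$html = <<<'HTML'
<div id="visitors-by-country">
  <div><a href="/topsites/countries/US">United States</a> <span>35.0%</span></div>
  <div><a href="/topsites/countries/IN">India</a> <span>8.8%</span></div>
</div>
HTML;

// Capture the country code, the country name, and the first
// percentage that follows the closing </a>.
preg_match_all(
    '/<a href="\/topsites\/countries\/([A-Z]{2})">([^<]+)<\/a>.*?(\d+(?:\.\d+)?)%/s',
    $html,
    $matches,
    PREG_SET_ORDER
);

$countries = [];
foreach ($matches as $m) {
    // $m[2] is the country name, $m[3] the number before the % sign.
    $countries[trim($m[2])] = (float) $m[3];
}

print_r($countries);
```

A regular expression like this is tied closely to the exact markup, so any change in the page layout is likely to break it.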

The DOM solution looks like:

$doc = new DOMDocument;
libxml_use_internal_errors(true); // real-world HTML is rarely valid; suppress parser warnings
$doc->loadHTMLFile('http://www.alexa.com/siteinfo/google.com');

$data = $doc->getElementById('visitors-by-country');

$countries = array();
if ($data !== null) {
    foreach ($data->getElementsByTagName('div') as $node)
    {
        foreach ($node->getElementsByTagName('a') as $href)
        {
            // Pull the percentage out of the text of the surrounding div
            if (preg_match('/[0-9.]+%/', $node->nodeValue, $match)) {
                $countries[trim($href->nodeValue)] = $match[0];
            }
        }
    }
}

var_dump($countries);
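To try the DOM approach without fetching the page, the same idea can be run on an HTML string. This is only a sketch: parseVisitorsByCountry is a hypothetical helper name, the sample markup is made up, and it uses DOMXPath in place of getElementById to find the links. It assumes each percentage sits in the same parent element as its country link.

```php
<?php
// Hypothetical helper: returns an array of country name => percentage (float).
function parseVisitorsByCountry(string $html): array
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $countries = [];
    // Every link inside the element with the assumed id.
    foreach ($xpath->query('//*[@id="visitors-by-country"]//a') as $link) {
        // Look for "number%" in the text of the link's parent element.
        if (preg_match('/(\d+(?:\.\d+)?)\s*%/', $link->parentNode->textContent, $m)) {
            $countries[trim($link->textContent)] = (float) $m[1];
        }
    }
    return $countries;
}

// Made-up sample; the real page's markup may differ.
$sample = <<<'HTML'
<html><body><div id="visitors-by-country">
  <div><a href="/topsites/countries/US">United States</a> 35.0%</div>
  <div><a href="/topsites/countries/IN">India</a> 8.8%</div>
</div></body></html>
HTML;

print_r(parseVisitorsByCountry($sample));
```

Returning floats rather than strings like "35.0%" makes the values easier to sort or sum later.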