How can I find out which page a URL redirects to in PHP?

If I have a URL (eg. http://www.foo.com/alink.pl?page=2), I want to determine if I am being redirected to another link. I'd also like to know the final URL (eg. http://www.foo.com/other_link.pl).

I want to know how to do that in PHP.

Thank you all for your help :)

(More information: I want a function called doesItDirect($url) that returns the URL it redirects to if there is a redirect, and returns the same URL that was passed in if there is none.)


If you're using cURL, you can call curl_getinfo ($ch, CURLINFO_EFFECTIVE_URL) after the request has run, as documented here: http://sg.php.net/manual/en/function.curl-getinfo.php

Example:

<?php
    $ch = curl_init ('http://www.foo.com/alink.pl?page=2');
    curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, true);  // follow Location headers
    curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);  // do not print the response body

    curl_exec ($ch);

    // On failure, $url stays null instead of being undefined.
    $url = null;
    if (!curl_errno ($ch))
        $url = curl_getinfo ($ch, CURLINFO_EFFECTIVE_URL);

    curl_close ($ch);

    echo $url;
?>
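
To get the doesItDirect($url) behaviour asked for in the question, the snippet above could be wrapped roughly like this. It's only a sketch: the $maxRedirects parameter and the timeout values are my own assumptions, not something from the question, and it falls back to the original URL whenever the request fails.

```php
<?php
// Sketch only: returns the final URL after following redirects, or the URL
// that was passed in when there is no redirect or the request fails.
// $maxRedirects and the timeouts below are assumed values, tune as needed.
function doesItDirect($url, $maxRedirects = 10) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);      // follow Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, $maxRedirects);  // guard against redirect loops
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);      // do not print the response body
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);         // seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);               // seconds

    curl_exec($ch);

    $final = $url;
    if (curl_errno($ch) === 0) {
        $final = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    }

    // The handle is released when $ch goes out of scope.
    return $final;
}
```

Note that cURL may hand back a slightly normalised URL even when nothing redirected (for example with a trailing slash added after a bare host name), so a plain string comparison against the input can report a "redirect" that isn't really one.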


You'll need to make an HTTP request to the URL and check the response headers you get back. A 3xx status such as 301 or 302 means it's a redirect. The redirect target is included in the response headers, in a line that looks like Location: <url>.

Update: the manual provided a useful example, from which I put together this, which seems to work:

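<?php
// Returns the Location target if $url answers with a redirect, false otherwise.
function isRedirectUrl($url) {
    $redirectCodes = array(301, 302, 303, 307, 308);
    $location = false;

    if ($fp = fopen($url, 'r')) {
        $meta = stream_get_meta_data($fp);
        fclose($fp); // always close, even when a redirect is found

        // The first line is the status line, e.g. "HTTP/1.1 301 Moved Permanently"
        $status = explode(' ', $meta['wrapper_data'][0], 3);
        $code = isset($status[1]) ? intval($status[1]) : 0;

        if (in_array($code, $redirectCodes)) {
            foreach ($meta['wrapper_data'] as $header) {
                // Skip lines without a colon (status lines)
                if (strpos($header, ':') === false) {
                    continue;
                }
                list($name, $value) = explode(':', $header, 2);

                // Header names are case-insensitive
                if (strcasecmp(trim($name), 'Location') == 0) {
                    $location = trim($value);
                    break;
                }
            }
        }
    }

    return $location;
}

// Follows redirects one hop at a time, giving up after $maxHops.
function getCanonicalUrl($url, $maxHops = 10) {
    $ret = $url;
    while ($maxHops-- > 0 && ($test = isRedirectUrl($ret))) {
        $ret = $test;
    }

    return $ret;
}

var_dump(getCanonicalUrl('http://<url to test>'));
?>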


It's not easy.

It's not impossible, but it's pretty darn hard. These are the ways you can do a redirection:

Header Redirection.

This is where you ask for "gimmiemypage.php" and instead of sending "200 - OK" as the status, it sends a "30? - Redirected" header (Where ? is 1 or 2). This is really easy to detect, because curl will tell you. Hurrah.

HTML Refresh Redirection.

This is where you use a <meta http-equiv="refresh"> tag and one second after parsing that, the browser forwards you onwards.

This is harder to detect because you have to specifically look for meta tags, so you'll need to parse arbitrary HTML (Do Not Use Regexes for this, That Would Be Bad) to find them. They should always be in <head>, but those wacky karazee webdevelopers might hide them.
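
As a rough illustration of the meta refresh case, here is a sketch that uses PHP's DOMDocument to parse the HTML instead of running a regex over the whole page. The function name findMetaRefreshUrl is made up for this example, and it assumes the DOM extension is available. The only regex in it picks apart the already-extracted content attribute, which is a short string of the form "1; url=...".

```php
<?php
// Hypothetical helper: returns the target URL of the first
// <meta http-equiv="refresh"> tag in $html, or null when there is none.
function findMetaRefreshUrl($html) {
    if (trim($html) === '') {
        return null; // DOMDocument::loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world HTML is often invalid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Search the whole document, not only <head>, in case the tag is misplaced.
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if (strcasecmp(trim($meta->getAttribute('http-equiv')), 'refresh') !== 0) {
            continue;
        }
        // content looks like "1; url=http://www.foo.com/other_link.pl"
        $content = $meta->getAttribute('content');
        $pattern = '/^\s*[\d.]*\s*[;,]\s*(?:url\s*=\s*)?["\']?(.*?)["\']?\s*$/is';
        if (preg_match($pattern, $content, $m) && $m[1] !== '') {
            return $m[1];
        }
    }

    return null;
}
```

The returned target can be a relative URL, so it would still need to be resolved against the address of the page before being requested.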

Then there are JavaScript redirects. Finding these without evaluating the JavaScript to see what happens is almost impossible. There are various ways you can redirect people in JS, and you could catch some of them with a parser. However, because this is JS, you'll end up needing to read and evaluate all the JS you can see on the page, and the included JS, and anything that includes...

My advice is to try and find a way that doesn't mean you need to know about all redirects, because it's a very deep well to fall into.
