
How to parse a page that returns a 302 header?

I have to parse a page in PHP. The page's URL returns a 302 Moved Temporarily header and redirects to a "not found" page. Its data can be retrieved manually through the console of the Firebug add-on for Mozilla Firefox, but if I try to parse it using PHP, I get that "not found" page in return. How can I parse that page?

Edit: I am doing something like this to get the page's content:

$file_results = @fopen("http://www.the url to be parses", "rb");
$parsed_results = '';
if ($file_results)
{
    // read in chunks until the end of the stream is reached
    while (!feof($file_results))
    {
        $parsed_results .= fread($file_results, 8192);
    }
    fclose($file_results);
}


You can use get_headers() to find all the headers while you're being redirected.

$url = 'http://google.com';
$headers = get_headers($url, 1);

print 'First step gave: ' . $headers[0] . '<br />';

// uncomment below to see the different redirection URLs
// print_r($headers['Location']);

// $headers['Location'] will contain either the redirect URL (a string), or an
// array of redirection URLs when there were several redirects
if (isset($headers['Location'])) {
    // is_array() is needed here: isset($str[0]) is also true for a string
    $first_redirect_url = is_array($headers['Location']) ?
        $headers['Location'][0] : $headers['Location'];

    print "First redirection is to: {$first_redirect_url}<br />";

    // assuming you have fopen wrappers enabled...
    print file_get_contents($first_redirect_url);
}

Then just keep following the redirects until you get the resource you want.
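Since get_headers() follows redirects by default, the last Location value it reports is usually the final target. Here is a minimal sketch of picking it out; the function name last_redirect_target() is made up for illustration, and it assumes the array comes from get_headers($url, 1).

```php
<?php
// Hypothetical helper (not part of PHP): given the array returned by
// get_headers($url, 1), return the last redirect target, or null if the
// response chain contained no Location header.
function last_redirect_target(array $headers): ?string
{
    // Header names are case-insensitive, so a server may send "location".
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (!isset($headers['location'])) {
        return null;
    }

    // A single redirect gives a string, several redirects give an array.
    $locations = (array) $headers['location'];

    return $locations ? end($locations) : null;
}
```

You could then pass the result to file_get_contents(), as in the snippet above, provided it is an absolute URL.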


You need to read the header, see where it is redirecting you, and make another request to get the actual resource. Kind of a pain, but that's how the protocol works. Most browsers do this transparently.
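The read-the-header-then-request-again loop could be sketched roughly like this. Both function names are hypothetical, the hop limit of 5 is an arbitrary choice, and the sketch assumes every Location value is an absolute URL and that the HTTP fopen wrapper is enabled.

```php
<?php
// Hypothetical helper: return the Location target if the raw header lines
// describe a 3xx response, otherwise null.
function find_redirect(array $headerLines): ?string
{
    $status = 0;
    $location = null;
    foreach ($headerLines as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        } elseif (stripos($line, 'Location:') === 0) {
            $location = trim(substr($line, strlen('Location:')));
        }
    }

    return ($status >= 300 && $status < 400) ? $location : null;
}

// Hypothetical helper: fetch $url, following redirects one hop at a time.
// Returns the body of the first non-redirect response, or null on failure.
function fetch_following_redirects(string $url, int $maxHops = 5): ?string
{
    $context = stream_context_create(['http' => [
        'follow_location' => 0,    // do not let PHP follow redirects itself
        'ignore_errors'   => true, // still open the stream on 4xx/5xx
    ]]);

    for ($hop = 0; $hop <= $maxHops; $hop++) {
        $fp = @fopen($url, 'rb', false, $context);
        if ($fp === false) {
            return null;
        }
        // For the http wrapper, wrapper_data holds the response header lines.
        $meta = stream_get_meta_data($fp);
        $headerLines = $meta['wrapper_data'] ?? [];
        $body = stream_get_contents($fp);
        fclose($fp);

        $next = find_redirect($headerLines);
        if ($next === null) {
            return $body === false ? null : $body;
        }
        $url = $next; // assumption: Location is an absolute URL
    }

    return null; // too many redirects
}
```

Calling fetch_following_redirects() with the page's URL would then return the final body, and logging $url on each hop shows exactly where the server sends you.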
