Serve any web page on the internet from a PHP file
How can I create a simple PHP file that retrieves the HTML and the headers of any web page on the internet, changes image/resource URLs to their full URLs (for example: image.gif to http://www.google.com/image.gif), and then sends the result back as the response?
Okay, first of all, to get the headers, use the PHP get_headers function. Passing true as the second argument returns them as an associative array keyed by header name.
<?php
$url = "http://www.example.com/";
$headers = get_headers($url, true);
?>
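If you want to send some of those headers back to the browser, one option is to filter the array first. The sketch below is only an illustration: forwardable_headers() and its allow-list are made-up names, not part of PHP, and it assumes $headers came from get_headers($url, true). Content-Length is deliberately left out because the body changes size once the URLs are rewritten.

```php
<?php
// Hypothetical helper (not a PHP built-in): pick the upstream headers worth
// sending back. $headers is assumed to be the result of get_headers($url, true).
function forwardable_headers(
    array $headers,
    array $allowed = ['content-type', 'last-modified', 'cache-control']
): array {
    $out = [];
    foreach ($headers as $name => $value) {
        // Integer keys hold status lines such as "HTTP/1.1 200 OK"; skip them.
        if (is_int($name)) {
            continue;
        }
        // Header names are case-insensitive, so compare in lower case.
        if (!in_array(strtolower($name), $allowed, true)) {
            continue;
        }
        // After a redirect a header can appear several times and arrive as an
        // array; keep the last value, which belongs to the final response.
        if (is_array($value)) {
            $value = end($value);
        }
        $out[] = $name . ': ' . $value;
    }
    return $out;
}
```

You would then call header($line) for each returned line before printing the page.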
Then read the content of the page into a variable.
<?php
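// Open the remote page for reading (requires allow_url_fopen to be enabled).
$handle = fopen($url, 'r');
if ($handle === false) {
    die('Could not open ' . $url);
}
// Accumulate the page in $text, which the next step rewrites.
$text = '';
while (!feof($handle)) {
    $text .= fread($handle, 8192);
}
fclose($handle);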
?>
You then need to run through the content looking for resources and prepend the URL to each one to get its absolute path, unless it is already absolute. The following regex example works on src attributes (e.g. images and JavaScript) and should give you a starting point for other resources, such as CSS, which uses href="". The regex won't match if the source contains a ':', which is a good indicator that it contains http:// and is therefore already an absolute path. PLEASE NOTE: this is by no means perfect and won't account for all sorts of weird and wonderful resource locations, but it's a good start.
<?php
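// Match src="..." values made only of letters, digits, '-', '_', '/' and '.'.
// Anything containing ':' (such as http://) is skipped as already absolute.
$pattern = '@src="[0-9A-Za-z_/.-]+"@';
preg_match_all($pattern, $text, $matches);
foreach ($matches[0] as $match) {
    // Strip the attribute prefix; the closing quote stays on the end of $src.
    $src = str_replace('src="', '', $match);
    // $url ends in '/', so drop a leading '/' to avoid a double slash.
    $text = str_replace($match, 'src="' . $url . ltrim($src, '/'), $text);
}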
print($text);
?>
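If the regex gets too fiddly, an alternative is to let PHP's DOM extension parse the page and rewrite the attributes for you. This is a rough sketch rather than a finished solution: absolutize() and rewrite_resource_urls() are names invented for this example, it assumes the dom extension is available, it only touches img/script src and link href, and it does not resolve "../" segments or URLs inside CSS.

```php
<?php
// Hypothetical helper: turn a reference found in the page into an absolute URL.
// Simplified on purpose: "../" segments are not resolved.
function absolutize(string $ref, string $base): string
{
    // Leave empty values and in-page anchors alone.
    if ($ref === '' || $ref[0] === '#') {
        return $ref;
    }
    // Anything that starts with a scheme (http:, data:, ...) is already absolute.
    if (preg_match('@^[a-z][a-z0-9+.-]*:@i', $ref)) {
        return $ref;
    }
    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host']
            . (isset($parts['port']) ? ':' . $parts['port'] : '');
    // Protocol-relative reference: reuse the scheme of the base URL.
    if (strpos($ref, '//') === 0) {
        return $parts['scheme'] . ':' . $ref;
    }
    // Root-relative reference: append to scheme and host only.
    if ($ref[0] === '/') {
        return $origin . $ref;
    }
    // Plain relative reference: resolve against the directory of the base path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $ref;
}

// Hypothetical helper: rewrite resource URLs in an HTML string.
function rewrite_resource_urls(string $html, string $base): string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $targets = ['img' => 'src', 'script' => 'src', 'link' => 'href'];
    foreach ($targets as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            if ($node->hasAttribute($attr)) {
                $node->setAttribute($attr, absolutize($node->getAttribute($attr), $base));
            }
        }
    }
    return $doc->saveHTML();
}
```

With the variables from the earlier snippets, print(rewrite_resource_urls($text, $url)); would take the place of the regex loop. Note that DOMDocument may normalise the markup (for example by adding a doctype), so the output is not byte-for-byte the original page.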
<?php
$file = "http://www.somesite/somepage";
$handle = fopen($file, "rb");
$text = '';
while (!feof($handle)) {
$text .= fread($handle, 8192);
}
fclose($handle);
print($text);
?>
I think what you're looking for is a PHP proxy script. There are several on the internet - this is one I created (although I don't have time to fix bugs at the moment).
I would recommend using one which is already created over one which you've written yourself, as it's not a trivial thing to do (there are better scripts than mine available as well).