PHP scrape title, IF
Basically I am wanting to get the title of pages,
I want it to return TRUE if title is like:
<title>
Site Name - Page</title>
but return false if title is like:
<title>
Site Name - </title>
How can I go about passing a URL into fopen, checking the title, and then returning TRUE/FALSE depending on the title? We only want it to be TRUE if there is text after the "-" in the title tag.
Here is the code I am currently working with:
while ($r = mysql_fetch_array($q)){
$url = "http://www.sitename/" . strtolower($r['z'] . "." . $r['x']) . "/";
$file = fopen(($url),"r") or die ("Can't read input stream");
$text = fread($file,32768);
if (preg_match('/<title>(.*?)<\/title>/is',$text,$found)) {
$title = 1;
} else {
$title = 0;
}
fclose($file);
}
I haven't verified your code for opening the URL, but I do see that your regex could be improved upon. Try this...
/<title>.+\s-\s.+<\/title>/is
where
.+
ensures there is at least one character before and after the dash, and
\s-\s
ensures that there is a " - " separating the first and second part of the title tag.
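To see how that pattern behaves, here is a minimal sketch that runs it against the two titles from the question. The helper name has_page_title and the two sample strings are made up for illustration; the pattern itself is the one given above. Note that with the s modifier the greedy .+ can span newlines, so this assumes the page contains a single title element.

```php
<?php
// Hypothetical helper: true when the title has text on both sides of " - ".
function has_page_title(string $html): bool
{
    // i = case-insensitive tags, s = "." also matches newlines
    return preg_match('/<title>.+\s-\s.+<\/title>/is', $html) === 1;
}

// Sample titles copied from the question (newline after the opening tag).
$good = "<title>\nSite Name - Page</title>";
$bad  = "<title>\nSite Name - </title>";

var_dump(has_page_title($good)); // bool(true)
var_dump(has_page_title($bad));  // bool(false)
```

The second title fails because nothing sits between " - " and the closing tag, so the final .+ has no character to match.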
I would wrap the title check in a function like this:
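function check_title($url){
    $html = file_get_contents($url);
    if ($html === false) {
        return FALSE; // page could not be fetched
    }
    // [^<] also matches newlines; require a non-space character after the dash
    return preg_match('/<title>[^<]*-\s*[^\s<][^<]*<\/title>/i', $html) === 1;
}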
and you could use it like this:
while ($r = mysql_fetch_array($q)){
$url = "http://www.sitename/" . strtolower($r['z'] . "." . $r['x']) . "/";
$title = check_title($url);
}
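If a single pattern gets hard to read, another option (not taken from either answer above, just a sketch) is to reuse the title-capturing regex from the question and then inspect the captured text with plain string functions. The function name title_has_page_part is hypothetical, and it takes the HTML as a string so it can be tried without a network request.

```php
<?php
// Hypothetical helper: true only when the <title> text has
// non-whitespace content after its first "-".
function title_has_page_part(string $html): bool
{
    // Capture everything between the title tags; s lets "." span newlines.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $found) !== 1) {
        return false; // no title element found
    }
    // Split once on the first dash: [site name, page part].
    $parts = explode('-', $found[1], 2);
    return count($parts) === 2 && trim($parts[1]) !== '';
}

var_dump(title_has_page_part("<title>\nSite Name - Page</title>")); // bool(true)
var_dump(title_has_page_part("<title>\nSite Name - </title>"));     // bool(false)
```

Inside the loop this could be called on the result of file_get_contents($url), after checking that the fetch did not return false. A site name that itself contains a dash would need a different split rule.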