How to parse the meta tags in a web page [duplicate]
Possible Duplicate:
CodeIgniter: A Class/Library to help get meta tags from a web page?
Can anybody write a simple program that reports "found" or "not found" for meta tags, all tags, and the robots.txt file?
<?php
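// The scheme is required; without it PHP looks for a local file named "example.com"
$url = 'http://example.com';
$meta = '<meta http-equiv="Content-type" content="text/html; charset=utf-8" />';
// Fetch the raw HTML; file_get_contents() returns false on failure
$contents = file_get_contents($url);
// Exact substring match: any difference in case, spacing or attribute order gives "not found"
if ($contents !== false && strpos($contents, $meta) !== false)
{
    echo 'found';
}
else
{
    echo 'not found';
}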
?>
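The exact string comparison above misses a tag as soon as the page writes it with different casing, spacing or attribute order. A minimal sketch of a more tolerant check, assuming the DOM extension is available; has_meta_tag() is a hypothetical helper name, not something from the question:

```php
<?php
// Hypothetical helper: returns true when the HTML contains a <meta> element
// whose $attr attribute equals $value (compared case-insensitively).
function has_meta_tag($html, $attr, $value)
{
    $doc = new DOMDocument();
    // Real-world HTML is often malformed; collect libxml warnings instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('meta') as $tag) {
        if (strcasecmp($tag->getAttribute($attr), $value) === 0) {
            return true;
        }
    }
    return false;
}

// Sample markup (made up for illustration): attributes in a different order than the string above
$html = '<html><head><meta content="text/html; charset=utf-8" http-equiv="Content-Type"></head><body></body></html>';
echo has_meta_tag($html, 'http-equiv', 'content-type') ? 'found' : 'not found';
```

In real use, $html would be the string returned by file_get_contents($url), after checking that it is not false.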
You can:
Use file_get_contents to retrieve raw HTML data
Tidy the HTML code to repair malformed markup and make it more readable; if Tidy is not installed on your web server (Debian/Ubuntu):
apt-get install php5-tidy
Parse the elements with DOMDocument
function get_meta($url)
{
    // Get & Tidy HTML
    $tidy = new tidy();
    $tidy->parseFile($url, array("output-html" => true));
    $tidy->cleanRepair();
    // Parse the repaired HTML; pass DOMDocument a string, not the tidy object
    $xml = new DOMDocument();
    libxml_use_internal_errors(true); // keep libxml parse warnings out of the output
    $xml->loadHTML(tidy_get_output($tidy));
    libxml_clear_errors();
    $meta_tags = $xml->getElementsByTagName("meta");
    // Put the meta information in an array keyed by http-equiv or name
    $meta = array();
    foreach ($meta_tags as $meta_tag)
    {
        $key = $meta_tag->hasAttribute("http-equiv") ? $meta_tag->getAttribute("http-equiv") : $meta_tag->getAttribute("name");
        $value = $meta_tag->hasAttribute("content") ? $meta_tag->getAttribute("content") : $meta_tag->getAttribute("value");
        // Skip tags that have neither http-equiv nor name (e.g. <meta charset="utf-8">)
        if ($key !== "")
        {
            $meta[$key] = $value;
        }
    }
    return $meta;
}
print_r(get_meta("http://php.net/manual/fr/tidy.cleanrepair.php"));
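To get the "found" / "not found" output the question asks for, the array returned by get_meta() can be checked against a list of wanted keys. A minimal sketch; report_meta() and the sample array are hypothetical and only illustrate the idea:

```php
<?php
// Hypothetical helper: maps each wanted meta key to 'found' or 'not found'.
function report_meta(array $meta, array $wanted)
{
    // Compare keys case-insensitively, since pages write "Content-Type", "content-type", etc.
    $lower = array_change_key_case($meta, CASE_LOWER);
    $report = array();
    foreach ($wanted as $name) {
        $report[$name] = array_key_exists(strtolower($name), $lower) ? 'found' : 'not found';
    }
    return $report;
}

// Sample data standing in for the result of get_meta($url)
$meta = array('Content-Type' => 'text/html; charset=utf-8', 'description' => 'Example page');
print_r(report_meta($meta, array('content-type', 'description', 'robots')));
```

This only covers meta tags; checking for robots.txt would need a separate request to that file.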