PHP cURL to grab specific HTML

I'm using this PHP:

<?php

$curl_handle=curl_init();
curl_setopt($curl_handle,CURLOPT_URL,'http://www.notrly.com/jackbauer/');
curl_setopt($curl_handle,CURLOPT_CONNECTTIMEOUT,2);
curl_setopt($curl_handle,CURLOPT_RETURNTRANSFER,1);
$buffer = curl_exec($curl_handle);
curl_close($curl_handle);

if (empty($buffer))
{
    print "Not today";
}
else
{
    print $buffer;
}
?>

There is a p tag with class "fact" in the source that I want to extract and display. How do I do it? Also, is it against copyright if I use this to grab someone else's HTML off their site?


If you want to use cURL, download the page and then run it through a DOM parser such as:

http://simplehtmldom.sourceforge.net/

Or you could just do something like this:

include_once('simple_html_dom.php');

$dom = file_get_html('http://www.notrly.com/jackbauer/');

// Print the contents of the first matching <p class="fact"> and stop the script.
foreach($dom->find("div.head div.fact p.fact") as $element)
    die($element->innertext);
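
If you would rather keep the cURL code from the question and avoid a third-party library, PHP's bundled DOM extension can do the same job. The following is only a minimal sketch: the function name extract_fact is made up for illustration, and it assumes the page really contains a p element whose class list includes "fact", as described in the question.

```php
<?php
// Hypothetical helper (not from the original answers): returns the text of the
// first <p> whose class list contains "fact", or null if there is none.
function extract_fact(string $html): ?string
{
    if ($html === '') {
        return null; // nothing to parse (e.g. curl_exec() failed)
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parse warnings quietly
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "fact" as a whole class token, so class="fact big" also matches.
    $nodes = $xpath->query(
        '//p[contains(concat(" ", normalize-space(@class), " "), " fact ")]'
    );

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}
```

With the $buffer from the question, usage would look roughly like this: $fact = extract_fact((string) $buffer); print $fact ?? "Not today";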


Take a look at strpos for searching within strings:

if (strpos($buffer, '<p class="fact">') !== FALSE) {
  print "Yay";
}
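
The check above only tells you that the tag is present. To actually pull out its contents with plain string functions, something like the sketch below might work. The helper name extract_between is hypothetical, and the approach assumes the opening tag appears in the source exactly as '<p class="fact">' (same attribute order, quoting, and spacing) with no nested p tags, so it is far more brittle than a real parser.

```php
<?php
// Hypothetical helper: returns the raw markup between the first occurrence of
// $open and the next occurrence of $close, or null if either is missing.
function extract_between(
    string $html,
    string $open = '<p class="fact">',
    string $close = '</p>'
): ?string {
    $start = strpos($html, $open);
    if ($start === false) {
        return null; // opening tag not found
    }
    $start += strlen($open); // move past the opening tag itself

    $end = strpos($html, $close, $start);
    if ($end === false) {
        return null; // no closing tag after the opening one
    }
    return substr($html, $start, $end - $start);
}
```

For example, print extract_between($buffer) ?? "Not today"; would print the inner HTML of the paragraph when it is found.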


I would check out the HTML parsers mentioned in the answer to this question. As for copyright issues, I think it would depend on many factors, including:

  • What you are doing with the content
  • How much of the content you are using
  • What the copyright is on the site you are scraping