PHP screen scraping using PHP Simple HTML DOM Parser
I am using Simple HTML DOM Parser to scrape a website ... How can I skip a particular class while in a loop?
Judging from http://simplehtmldom.sourceforge.net/manual.htm#frag_find_attr you can use:
->find("div[class!=skip_me]")
Or use the DOM methods and compare the result of ->getAttribute("class") against the value you want to skip.
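The DOM approach can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the sample markup and the $kept array are made up, and "skip_me" is just the class name borrowed from the selector above. It uses libxml_use_internal_errors() to quiet parser warnings.

```php
<?php
// Hypothetical sample markup; "skip_me" is the class we want to skip.
$html = '<div class="keep">one</div><div class="skip_me">two</div><div>three</div>';

$dom = new DOMDocument();
// Collect libxml warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$kept = [];
foreach ($dom->getElementsByTagName('div') as $div) {
    // getAttribute() returns an empty string when the attribute is absent.
    // Split on whitespace so class="a skip_me" is also caught.
    $classes = preg_split('/\s+/', trim($div->getAttribute('class')));
    if (in_array('skip_me', $classes, true)) {
        continue; // skip this element and go on with the loop
    }
    $kept[] = $div->textContent;
}

print_r($kept);
```

Splitting the attribute on whitespace is a design choice here: a plain string comparison would miss elements that carry several classes.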
// DOMDocument can load HTML tag soup, but malformed markup triggers
// warnings, so suppress them with @.
$htmlDom = new DOMDocument();
// loadHTML() returns false on failure; $htmlDom itself is always truthy.
if (@$htmlDom->loadHTML($html)) {
  // SimpleXML is much easier to work with than DOM, and luckily
  // we can simply import our DOM tree.
  $elements = simplexml_import_dom($htmlDom);
}
This is quoted (almost verbatim) from Drupal 7 SimpleTest. After this, the document is a lot easier to work with: an element's class can be read as $element['class'].
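For instance, skipping a class while looping over the imported tree might look like this. It is a minimal sketch under assumptions: the list markup, the //li XPath query, and the $texts array are hypothetical and not part of the Drupal code.

```php
<?php
// Hypothetical sample markup for illustration only.
$html = '<ul><li class="item">a</li><li class="skip_me">b</li><li class="item">c</li></ul>';

$htmlDom = new DOMDocument();
// Collect libxml warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$loaded = $htmlDom->loadHTML($html);
libxml_clear_errors();

$texts = [];
if ($loaded) {
    // Import the DOM tree so it can be walked with SimpleXML.
    $elements = simplexml_import_dom($htmlDom);
    foreach ($elements->xpath('//li') as $element) {
        // Attributes are SimpleXMLElement objects, so cast before comparing.
        if ((string) $element['class'] === 'skip_me') {
            continue; // skip this element
        }
        $texts[] = (string) $element;
    }
}

print_r($texts);
```

The (string) cast matters: comparing the raw attribute object with === would never match a string.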