Simple HTML DOM Parser - Skip certain element
I am using the Simple HTML DOM Parser and I want to completely ignore the contents of the "nested" element and get the contents of the "pre" element that follows it.
<div id=parent>
<div class="nested">
<pre>Text that I want ignored</pre>
</div>
<pre>
This is the text I want to access
</pre>
</div>
I don't have control of the HTML source, and the owner has recently added the "nested" element. Previously, I accessed the content I needed like this:
$page_contents = file_get_html($url);
$div_content = $page_contents->find('div[id=parent]pre', 0)->innertext;
But obviously the new nested element has broken my method.
I can't seem to find any official documentation regarding this kind of scenario.
Not tested, but try this:
$div_content = $page_contents->find('div[id=parent][class!=nested]pre', 0)->innertext;
or
$div_content = $page_contents->find('div[id=parent class!=nested]pre', 0)->innertext;
Or maybe even just this. I think this is really the one, but again, I have not tested it:
$div_content = $page_contents->find('div[class!=nested]pre', 1)->innertext;
I still don't know if this will work, but try this:
$div_content = $page_contents->find('div[class!=nested pre]', 0)->innertext;
or
$div_content = $page_contents->find('div[class!=nested pre]', 0)->plaintext;
find('div[id=parent] pre') finds all pre tags in the specified div and doesn't care if one of them is enclosed in another div, so here are a few suggestions:
If you know exactly which pre you want to get, just specify its index, counting from zero. In your case:
$div_content = $page_contents->find('div[id=parent] pre', 1)->innertext;
In case you don't know how many pre elements there are, or don't know their order, you could just remove the one you don't want and then run the previous line, this time with index 0:
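$page_contents->find('div[id=parent] div[class=nested] pre', 0)->outertext = '';
// Setting outertext only changes the serialized output, so re-parse before searching again
$page_contents = str_get_html($page_contents->save());
$div_content = $page_contents->find('div[id=parent] pre', 0)->innertext;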
And in case you don't want to change $page_contents, just work on the parent div through a temporary variable and do the same as above:
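// Parse a separate copy of the parent div so $page_contents stays untouched
$temp = str_get_html($page_contents->find('div[id=parent]', 0)->outertext);
$temp->find('div[class=nested] pre', 0)->outertext = '';
// Re-parse so the blanked element is gone from the node tree
$temp = str_get_html($temp->save());
$div_content = $temp->find('pre', 0)->innertext;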
Of course, there are a lot of other ways to do this. You should read the manual at http://simplehtmldom.sourceforge.net/manual.htm, though it mentions just the main features; a lot more is under the hood.
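One more option, offered as a minimal sketch rather than a tested solution: walk only the direct children of the parent div and take the first pre among them. That skips anything wrapped inside the nested div and leaves the document unmodified. The sketch assumes simple_html_dom.php is available on the include path (the path here is a guess), and it uses str_get_html with an inline copy of the question's HTML in place of file_get_html($url).

```php
<?php
// Assumption: the Simple HTML DOM library file is reachable at this path.
require_once 'simple_html_dom.php';

// Inline copy of the markup from the question, standing in for file_get_html($url).
$html = <<<HTML
<div id="parent">
<div class="nested">
<pre>Text that I want ignored</pre>
</div>
<pre>
This is the text I want to access
</pre>
</div>
HTML;

$page_contents = str_get_html($html);
$parent = $page_contents->find('div[id=parent]', 0);

$div_content = null;
if ($parent !== null) {
    // children() returns only the direct child elements of the parent div,
    // so a pre inside div.nested is never visited.
    foreach ($parent->children() as $child) {
        if ($child->tag === 'pre') {
            $div_content = $child->innertext;
            break;
        }
    }
}

// trim() drops the whitespace left around the text by the line breaks in the source.
echo $div_content === null
    ? "no direct pre child found\n"
    : trim($div_content) . "\n";
```

Because nothing is removed or rewritten, this keeps working no matter how many pre elements the site owner adds inside nested wrappers, as long as the one you want stays a direct child of the parent div.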