Retrieving text from HTML sibling elements using QueryPath in PHP
I'm extracting data from some old HTML files using PHP and the QueryPath library. Extraction is easy when the element I need has a unique ID or class to select on, but this isn't always the case. I have some files containing the following type of data:
<div id="dataDiv">
<div class="1">Heading1</div><div class="2" title="">Data1</div>
<div class="1">Heading2</div><div class="2" title="">Data2</div>
</div>
I would like to use QueryPath to search for a DIV of class "1" containing a certain string of text ("Heading2", for example), and then retrieve any text in the sibling div of class 2 directly next to it. (It would retrieve "Data2" in this case).
Is there built-in functionality in QueryPath that allows me to navigate to an element based on the text it contains? If so, once I locate that element, how can I then get the content text of its next sibling element?
My first idea would be to use the not() function. An example:
$qp2 = qp($tb)->find('table tr')->not('table tr table tr');
Use the sibling operator in CSS 3:
qp($html, 'div.1:contains("Heading1") + div.2')->text();
The above matches the <div class="1"> whose text contains Heading1, then gets the adjacent sibling whose class is 2.
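One caveat: strictly speaking, CSS identifiers may not start with a digit, so a selector such as div.1 may or may not be accepted depending on the parser. If it gives you trouble, here is a rough sketch of the same lookup using PHP's bundled DOM extension instead of QueryPath. The helper name findValueForHeading() is made up for illustration, and it assumes you want an exact (trimmed) match on the heading text rather than a substring match.

```php
<?php
// Sketch only: plain DOM/XPath, not QueryPath.
// findValueForHeading() is a hypothetical helper, not part of any library.
function findValueForHeading(string $html, string $heading): ?string
{
    // Collect parser warnings internally so sloppy legacy markup stays quiet.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//div[@class="1"]') as $label) {
        // Exact match on the trimmed text; loosen this if you need "contains".
        if (trim($label->textContent) !== $heading) {
            continue;
        }
        // Skip text/comment nodes to reach the next element sibling.
        $next = $label->nextSibling;
        while ($next !== null && $next->nodeType !== XML_ELEMENT_NODE) {
            $next = $next->nextSibling;
        }
        // Only accept the sibling if it is the class "2" div directly after.
        if ($next instanceof DOMElement && $next->getAttribute('class') === '2') {
            return trim($next->textContent);
        }
    }
    return null; // heading not found, or no class "2" div right after it
}

$html = <<<HTML
<div id="dataDiv">
<div class="1">Heading1</div><div class="2" title="">Data1</div>
<div class="1">Heading2</div><div class="2" title="">Data2</div>
</div>
HTML;

echo findValueForHeading($html, 'Heading2'), PHP_EOL; // prints "Data2"
```

Comparing the text in PHP rather than embedding it in the XPath expression avoids having to escape quotes in the heading string.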