PHP: extract info from an HTML page
I have this HTML:
<input type=hidden name="code1" value="AA-T5301">
<tr>
<td align=left valign=middle class="stdtext">
AA-T5301
</a>
</td>
<td valign=middle align=left class="stdtext">
<a onMouseOver="window.status='See the more info on '; return true"
HREF="product.asp?ms=&dept_id=322&sku=32&nav=">
Grapeseed Oil 150ml
</A>
</td>
<td valign=middle align=right class="stdtext">Order Now</td>
<td valign=middle align=right class="stdtext">
<font class="productsale">
<strike>£3.04</strike>
 
</font>
£2.04
</td>
<td valign=middle align=right class="stdtext">
<input type=text size=4 name="qty_AA-T5301" value="0">
</td>
</tr>
<input type=hidden name="code2" value="AA-T5302">
<tr>
<td align=left valign=middle class="stdtext">
AA-T5302
</a>
</td>
<td valign=middle align=left class="stdtext">
<a onMouseOver="window.status='See the more info on '; return true"
HREF="product.asp?ms=&dept_id=322&sku=143&nav=">
Grapeseed Oil 500ml
</A>
</td>
<td valign=middle align=right class="stdtext">Order Now</td>
<td valign=middle align=right class="stdtext">
<font class="productsale">
<strike>£6.46</strike>
 
</font>
£4.33
</td>
<td valign=middle align=right class="stdtext">
<input type=text size=4 name="qty_AA-T5302" value="0">
</td>
</tr>
<input type=hidden name="code3" value="AA-T530">
<tr>
<td align=left valign=middle class="stdtext">
AA-T530
</a>
</td>
<td valign=middle align=left class="stdtext">
<a onMouseOver="window.status='See the more info on '; return true"
HREF="product.asp?ms=&dept_id=322&sku=19&nav=">
Grapeseed Oil 50ml
</A>
</td>
<td valign=middle align=right class="stdtext">Out of Stock</td>
<td valign=middle align=right class="stdtext">
<font class="productsale">
<strike>£1.75</strike>
 
</font>
£1.17
</td>
<td valign=middle align=right class="stdtext">
<input type=text size=4 name="qty_AA-T530" value="0">
</td>
</tr>
How can I extract the info into arrays, so that I have something like this:
product_code_array=(AA-T5301,AA-T5302,AA-T530);
RRP_array=(3.04,6.46,1.75);
price_array=(2.04,4.33,1.17);
Note: there may be more than 3 items on a page at a time, or there may be only 1.
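You can try using the DOMDocument class, but you have to fix your HTML first: it has closing link tags (</a>) without any matching opening link tags (<a href="">).
Alternatively, you can use regular expressions. Note that in the example below the current price has been wrapped in <span id="now"> so that it can be matched: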
<?php
$text = '<input type=hidden name="code1" value="AA-T5301">
<tr>
<td align=left valign=middle class="stdtext">
AA-T5301
</a>
</td>
<td valign=middle align=left class="stdtext">
<a onMouseOver="window.status=\'See the more info on \'; return true"
HREF="product.asp?ms=&dept_id=322&sku=32&nav=">
Grapeseed Oil 150ml
</A>
</td>
<td valign=middle align=right class="stdtext">Order Now</td>
<td valign=middle align=right class="stdtext">
<font class="productsale">
<strike>£3.04</strike>
 
</font>
<span id="now">£2.04</span>
</td>
<td valign=middle align=right class="stdtext">
<input type=text size=4 name="qty_AA-T5301" value="0">
</td>
</tr>
<input type=hidden name="code2" value="AA-T5302">
<tr>
<td align=left valign=middle class="stdtext">
AA-T5302
</a>
</td>
<td valign=middle align=left class="stdtext">
<a onMouseOver="window.status=\'See the more info on \'; return true"
HREF="product.asp?ms=&dept_id=322&sku=143&nav=">
Grapeseed Oil 500ml
</A>
</td>
<td valign=middle align=right class="stdtext">Order Now</td>
<td valign=middle align=right class="stdtext">
<font class="productsale">
<strike>£6.46</strike>
 
</font>
<span id="now">£4.33</span>
</td>
<td valign=middle align=right class="stdtext">
<input type=text size=4 name="qty_AA-T5302" value="0">
</td>
</tr>
<input type=hidden name="code3" value="AA-T530">
<tr>
<td align=left valign=middle class="stdtext">
AA-T530
</a>
</td>
<td valign=middle align=left class="stdtext">
<a onMouseOver="window.status=\'See the more info on \'; return true"
HREF="product.asp?ms=&dept_id=322&sku=19&nav=">
Grapeseed Oil 50ml
</A>
</td>
<td valign=middle align=right class="stdtext">Out of Stock</td>
<td valign=middle align=right class="stdtext">
<font class="productsale">
<strike>£1.75</strike>
 
</font>
<span id="now">£1.17</span>
</td>
<td valign=middle align=right class="stdtext">
<input type=text size=4 name="qty_AA-T530" value="0">
</td>
</tr>';
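// Each preg_match_all() call fills $values[n]; index 1 of the result holds the captured group.
$values = array();
// [0-9]+ rather than [0-9], so that code10, code11, ... are matched as well
preg_match_all('#<input type=hidden name="code[0-9]+" value="(.*)">#isU', $text, $values[0]);
preg_match_all('#<strike>£([0-9.]+)</strike>#isU', $text, $values[1]);
preg_match_all('#<span id="now">£([0-9.]+)</span>#isU', $text, $values[2]);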
$product_code_array = $values[0][1];
$RRP_array = $values[1][1];
$price_array = $values[2][1];
?>
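If the page cannot be edited to add the span (for example because it is fetched from a remote site), a variation on the same idea might work on the unmodified markup. The following is only a sketch under assumptions: it assumes the current price is always the first number after the closing </font> tag, as in the rows shown in the question, and the sample in $page is trimmed to two rows. The variable names are made up for illustration.

```php
<?php
// Trimmed sample modelled on the markup in the question (no <span> added).
$page = <<<'HTML'
<input type=hidden name="code1" value="AA-T5301">
<tr>
<td valign=middle align=right class="stdtext">
<font class="productsale">
<strike>£3.04</strike>
</font>
£2.04
</td>
</tr>
<input type=hidden name="code3" value="AA-T530">
<tr>
<td valign=middle align=right class="stdtext">
<font class="productsale">
<strike>£1.75</strike>
</font>
£1.17
</td>
</tr>
HTML;

// Product codes: the value attribute of the hidden "codeN" inputs.
preg_match_all('#<input\s+type=hidden\s+name="code[0-9]+"\s+value="([^"]*)"#i', $page, $m);
$product_code_array = $m[1];

// RRP: the number inside <strike>. [^0-9<]* skips the currency sign
// whatever its encoding, so the pattern does not depend on how "£" is stored.
preg_match_all('#<strike>[^0-9<]*([0-9]+\.[0-9]+)#i', $page, $m);
$RRP_array = $m[1];

// Price: the first number after </font> (assumption: no tag in between).
preg_match_all('#</font>[^0-9<]*([0-9]+\.[0-9]+)#i', $page, $m);
$price_array = $m[1];

print_r($product_code_array);
print_r($RRP_array);
print_r($price_array);
```

Because every pattern is applied to the whole page, it works for one row or for many. If the real page uses <font> tags elsewhere, the last pattern would need to be tightened.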
Look at this code for a start:
$dom = new DOMDocument;
$dom->strictErrorChecking = false;
$dom->preserveWhiteSpace = true;
// $html holds the page source; @ suppresses warnings about the malformed markup
@$dom->loadHTML($html);
$trs = $dom->getElementsByTagName('tr');
foreach ($trs as $tr)
{
// Now go through those elements; look up the PHP documentation on the DOM parser.
// You can iterate through sub-elements (like the 'td') just like through the 'tr'.
}
You can also go directly for the specific elements you need, rather than iterating over every row. Depending on the layout of your page, you might want to start from a "higher" node, so that you know where you are based on tag attributes or tag position.
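To make the "go directly for the elements" idea concrete, here is a minimal sketch that uses DOMXPath on top of DOMDocument. It is not a definitive implementation: the helper first_number() is hypothetical, the sample in $html is trimmed to two rows, and the price lookup assumes the current price is the first text containing a number after each <strike> element.

```php
<?php
// Trimmed sample modelled on the markup in the question.
$html = <<<'HTML'
<input type=hidden name="code1" value="AA-T5301">
<tr>
<td class="stdtext">AA-T5301</td>
<td class="stdtext">Grapeseed Oil 150ml</td>
<td class="stdtext"><font class="productsale"><strike>£3.04</strike></font> £2.04</td>
</tr>
<input type=hidden name="code2" value="AA-T5302">
<tr>
<td class="stdtext">AA-T5302</td>
<td class="stdtext">Grapeseed Oil 500ml</td>
<td class="stdtext"><font class="productsale"><strike>£6.46</strike></font> £4.33</td>
</tr>
HTML;

// Hypothetical helper: return the first number in a string such as "£3.04", or null.
function first_number($text)
{
    return preg_match('/[0-9]+(?:\.[0-9]+)?/', $text, $m) ? $m[0] : null;
}

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Product codes: hidden inputs whose name starts with "code".
$product_code_array = array();
foreach ($xpath->query('//input[@type="hidden"][starts-with(@name, "code")]') as $input) {
    $product_code_array[] = $input->getAttribute('value');
}

$RRP_array = array();
$price_array = array();
foreach ($xpath->query('//strike') as $strike) {
    // RRP: the number inside the <strike> element itself.
    $RRP_array[] = first_number($strike->textContent);

    // Price: first text node after the <strike> that contains a number
    // (assumption: the price text directly follows the struck-through RRP).
    $price = null;
    foreach ($xpath->query('following::text()', $strike) as $node) {
        $price = first_number($node->nodeValue);
        if ($price !== null) {
            break;
        }
    }
    $price_array[] = $price;
}

print_r($product_code_array);
print_r($RRP_array);
print_r($price_array);
```

Extracting the number with a regular expression, rather than stripping the "£" sign, keeps the result independent of how loadHTML() decodes the page encoding. libxml_use_internal_errors() is used here in place of the @ operator so that the warnings caused by the stray </a> tags can still be inspected if needed.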