
PHP: extract info from an HTML page

I have this HTML:

<input type=hidden name="code1" value="AA-T5301">
    <tr>
        <td align=left valign=middle class="stdtext">
            AA-T5301
        </a>
        </td>
        <td valign=middle align=left class="stdtext">
            <a onMouseOver="window.status='See the more info on '; return true"
                HREF="product.asp?ms=&dept_id=322&sku=32&nav=">
                Grapeseed Oil 150ml
            </A>
        </td>
        <td valign=middle align=right class="stdtext">Order Now</td>
        <td valign=middle align=right class="stdtext">
            <font class="productsale">
                <strike>£3.04</strike>
                &#160;
            </font>
            £2.04
        </td>
        <td valign=middle align=right class="stdtext">
            <input type=text size=4 name="qty_AA-T5301" value="0">
        </td>
    </tr>
    <input type=hidden name="code2" value="AA-T5302">
        <tr>
            <td align=left valign=middle class="stdtext">
                AA-T5302
            </a>
            </td>
            <td valign=middle align=left class="stdtext">
                <a onMouseOver="window.status='See the more info on '; return true"
                    HREF="product.asp?ms=&dept_id=322&sku=143&nav=">
                    Grapeseed Oil 500ml
                </A>
            </td>
            <td valign=middle align=right class="stdtext">Order Now</td>
            <td valign=middle align=right class="stdtext">
                <font class="productsale">
                    <strike>£6.46</strike>
                    &#160;
                </font>
                £4.33
            </td>
            <td valign=middle align=right class="stdtext">
                <input type=text size=4 name="qty_AA-T5302" value="0">
            </td>
        </tr>
        <input type=hidden name="code3" value="AA-T530">
            <tr>
                <td align=left valign=middle class="stdtext">
                    AA-T530
                </a>
                </td>
                <td valign=middle align=left class="stdtext">
                    <a onMouseOver="window.status='See the more info on '; return true"
                        HREF="product.asp?ms=&dept_id=322&sku=19&nav=">
                        Grapeseed Oil 50ml
                    </A>
                </td>
                <td valign=middle align=right class="stdtext">Out of Stock</td>
                <td valign=middle align=right class="stdtext">
                    <font class="productsale">
                        <strike>£1.75</strike>
                        &#160;
                    </font>
                    £1.17
                </td>
                <td valign=middle align=right class="stdtext">
                    <input type=text size=4 name="qty_AA-T530" value="0">
                </td>
            </tr>

How can I extract the info into arrays, so that I end up with something like this:

product_code_array=(AA-T5301,AA-T5302,AA-T530);

RRP_array=(3.04,6.46,1.75);

price_array=(2.04,4.33,1.17);

Note: there may be more than 3 items on a page at a time, or there may be only 1.


You can try the DOMDocument class, but you will have to fix your HTML first: it contains closing link tags (</a>) with no matching opening tags (<a href="">).


<?php
 $text = '<input type=hidden name="code1" value="AA-T5301">
    <tr>
        <td align=left valign=middle class="stdtext">
            AA-T5301
        </a>
        </td>
        <td valign=middle align=left class="stdtext">
            <a onMouseOver="window.status=\'See the more info on \'; return true"
                HREF="product.asp?ms=&dept_id=322&sku=32&nav=">
                Grapeseed Oil 150ml
            </A>
        </td>
        <td valign=middle align=right class="stdtext">Order Now</td>
        <td valign=middle align=right class="stdtext">
            <font class="productsale">
                <strike>£3.04</strike>
                &#160;
            </font>
            <span id="now">£2.04</span>
        </td>
        <td valign=middle align=right class="stdtext">
            <input type=text size=4 name="qty_AA-T5301" value="0">
        </td>
    </tr>
    <input type=hidden name="code2" value="AA-T5302">
        <tr>
            <td align=left valign=middle class="stdtext">
                AA-T5302
            </a>
            </td>
            <td valign=middle align=left class="stdtext">
                <a onMouseOver="window.status=\'See the more info on \'; return true"
                    HREF="product.asp?ms=&dept_id=322&sku=143&nav=">
                    Grapeseed Oil 500ml
                </A>
            </td>
            <td valign=middle align=right class="stdtext">Order Now</td>
            <td valign=middle align=right class="stdtext">
                <font class="productsale">
                    <strike>£6.46</strike>
                    &#160;
                </font>
                <span id="now">£4.33</span>
            </td>
            <td valign=middle align=right class="stdtext">
                <input type=text size=4 name="qty_AA-T5302" value="0">
            </td>
        </tr>
        <input type=hidden name="code3" value="AA-T530">
            <tr>
                <td align=left valign=middle class="stdtext">
                    AA-T530
                </a>
                </td>
                <td valign=middle align=left class="stdtext">
                    <a onMouseOver="window.status=\'See the more info on \'; return true"
                        HREF="product.asp?ms=&dept_id=322&sku=19&nav=">
                        Grapeseed Oil 50ml
                    </A>
                </td>
                <td valign=middle align=right class="stdtext">Out of Stock</td>
                <td valign=middle align=right class="stdtext">
                    <font class="productsale">
                        <strike>£1.75</strike>
                        &#160;
                    </font>
                    <span id="now">£1.17</span>
                </td>
                <td valign=middle align=right class="stdtext">
                    <input type=text size=4 name="qty_AA-T530" value="0">
                </td>
            </tr>';

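    // Note: the current price is only matchable here because <span id="now"> was
    // added to $text above; the HTML in the question has no such wrapper.
    $values = array();
    // code[0-9]+ also matches code10 and above when a page lists ten or more items.
    preg_match_all('#<input type=hidden name="code[0-9]+" value="(.*)">#isU', $text, $values[0]);
    preg_match_all('#<strike>£([0-9.]+)</strike>#isU', $text, $values[1]);
    preg_match_all('#<span id="now">£([0-9.]+)</span>#isU', $text, $values[2]);

    // Index 1 of each result holds the text captured by the first group.
    $product_code_array = $values[0][1];
    $RRP_array = $values[1][1];
    $price_array = $values[2][1];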
?>
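
The snippet above finds the current price only because `<span id="now">` was added to the markup by hand, which is not possible when the page comes from someone else's site. As a rough sketch, the same three arrays can be pulled from the unmodified HTML by anchoring the price on the closing `</font>` tag instead. The function name `extractWithRegex` is made up for this example, and the patterns assume the page always follows the layout shown in the question.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes every price cell looks like:
//   <font class="productsale"><strike>£RRP</strike>&#160;</font> £PRICE </td>
function extractWithRegex(string $html): array
{
    // Product codes: value attribute of the hidden inputs named code1, code2, ...
    preg_match_all('#<input\s+type=hidden\s+name="code[0-9]+"\s+value="([^"]*)"#i', $html, $codes);

    // RRP: the number inside <strike>; \D* skips the currency symbol whatever its encoding.
    preg_match_all('#<strike>\D*([0-9]+(?:\.[0-9]+)?)\s*</strike>#i', $html, $rrp);

    // Current price: the first number between </font> and the end of the cell.
    preg_match_all('#</font>\D*?([0-9]+(?:\.[0-9]+)?)\s*</td>#i', $html, $price);

    return array(
        'codes' => $codes[1],
        'rrp'   => $rrp[1],
        'price' => $price[1],
    );
}
```

Calling `extractWithRegex($text)` returns the codes, RRPs and prices as strings, in page order. Regular expressions like these break as soon as the markup changes (extra attributes, different attribute order), so treat this as a quick fix rather than a robust parser.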


Look at this code for a start:

$dom = new DOMDocument;
$dom->strictErrorChecking = false;
$dom->preserveWhiteSpace = true;
// $html holds the page source; @ hides the warnings raised by the malformed markup
@$dom->loadHTML($html);
$trs = $dom->getElementsByTagName('tr');
foreach ($trs as $tr)
{
   // now go through those elements. lookup the PHP doc on the DOM parser
   // You can iterate through sub-elements (like the 'td') just like through the 'tr'
}

You can also go directly for the elements you need instead of walking every row. Depending on the layout of your page, you might want to start from a "higher" node, so that you know where you are based on tag attributes or tag position.
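
To make the "go directly for the elements" idea concrete, here is a minimal sketch that uses DOMXPath on top of DOMDocument. The helper names `extractProducts` and `firstNumber` are invented for this example, and the XPath expressions assume the layout from the question: hidden inputs named `code1`, `code2`, and so on, plus a `<font class="productsale">` element that holds the struck-out RRP and is followed by the current price as plain text.

```php
<?php
// Hypothetical helpers, not part of the original answer.
function extractProducts(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $result = array('codes' => array(), 'rrp' => array(), 'price' => array());

    // Product codes: hidden inputs whose name starts with "code".
    foreach ($xpath->query('//input[@type="hidden"][starts-with(@name, "code")]') as $input) {
        $result['codes'][] = $input->getAttribute('value');
    }

    // RRP sits inside <font class="productsale">; the current price is the
    // first text node that follows that element in the same cell.
    foreach ($xpath->query('//font[@class="productsale"]') as $font) {
        $result['rrp'][] = firstNumber($font->textContent);
        $after = $xpath->query('following-sibling::text()[1]', $font)->item(0);
        $result['price'][] = $after ? firstNumber($after->nodeValue) : null;
    }

    return $result;
}

// Returns the first decimal number in $text, ignoring the currency symbol
// (which may come out mangled if the page encoding is not declared).
function firstNumber(string $text): ?string
{
    return preg_match('/[0-9]+(?:\.[0-9]+)?/', $text, $m) ? $m[0] : null;
}
```

`extractProducts($html)` returns the three lists in page order, so index 0 of each list belongs to the first product. The parser tolerates the stray `</a>` tags in the question's markup, and picking the number out with a regular expression avoids having to match the `£` character exactly.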
