How to scrape this with Simple HTML DOM [closed]

This question is unlikely to help any future visitors; it is only relevant to a small geographic area, a specific moment in time, or an extraordinarily narrow situation that is not generally applicable to the worldwide audience of the internet. For help making this question more broadly applicable, visit the help center. Closed 10 years ago.

I'm trying to use Simple HTML DOM to extract elements from a file that looks like the sample below.

  • The file has several tables that all share the same class, class="sometable".
  • Each table has a few <tr class="sometr"> rows.
  • Inside each tr there is a th that holds the title and a td that holds the category.

What I want to extract is every title (class="title") and its corresponding category number (class="category") for all table rows in all tables. I've loaded the file into $html. Can someone tell me what I'm supposed to pass to find() after that? I even tried $collection = $html->find('tr'); and ran var_dump on the collection but got nothing, so it looks like I'm not selecting correctly.

<table class="sometable">
  <tbody>
    <tr class="sometr">
      <th><a class="title">Table 1 Title1</a></th>
      <td class="category" id="categ-113"></td>
      <td class="somename">Table 1 Title 1 name</td>
    </tr>
    <tr></tr>
    <tr></tr>                           
  </tbody>
</table>

<table class="sometable">
</table>

<table class="sometable">
</table>


I have tested this and it works:

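// $html is the document already loaded with Simple HTML DOM, as in the question.
$tables = $html->find('table[@class="sometable"]');

foreach($tables as $table)
{
    // Link text of every <a class="title"> in this table.
    $titles = $table->find('a[@class="title"]');
    foreach($titles as $title)
    {
        echo "Link title = " . $title->plaintext ."<br />";
    }

    // The category number is stored in the id attribute, e.g. "categ-113".
    $categories = $table->find('td[@class="category"]');
    foreach($categories as $category)
    {
        echo "Category id = " . $category->id ."<br />";
    }

    // Text of every <td class="somename"> cell.
    $titles2 = $table->find('td[@class="somename"]');
    foreach($titles2 as $title2)
    {
        echo "Title2 = " . $title2->plaintext ."<br />";
    }
}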
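
The loops above print titles and categories as separate lists, while the question asks for each title together with its corresponding category. One way to keep them paired is to iterate row by row instead. The following is only a sketch under a few assumptions: simple_html_dom.php is assumed to sit next to the script, extract_title_categories() is a hypothetical helper name (not part of the library), and the category number is assumed to be whatever follows "categ-" in the id attribute, as in the sample markup.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory; adjust the path as needed.
require_once 'simple_html_dom.php';

/**
 * Hypothetical helper (not part of Simple HTML DOM): returns one
 * ['title' => ..., 'category' => ...] entry per row that has both cells.
 */
function extract_title_categories($html)
{
    $pairs = array();

    // Walk every row of every table with class "sometable".
    foreach ($html->find('table.sometable tr') as $row) {
        // An index as the second argument returns a single element, or null if none matched.
        $title    = $row->find('a.title', 0);
        $category = $row->find('td.category', 0);

        // Skip the empty <tr></tr> rows, which contain neither cell.
        if ($title === null || $category === null) {
            continue;
        }

        $pairs[] = array(
            'title'    => trim($title->plaintext),
            // Assumed id format "categ-<number>", taken from the sample markup.
            'category' => str_replace('categ-', '', $category->id),
        );
    }

    return $pairs;
}

// Sample markup copied from the question, trimmed to a single table.
$sample = <<<HTML
<table class="sometable">
  <tbody>
    <tr class="sometr">
      <th><a class="title">Table 1 Title1</a></th>
      <td class="category" id="categ-113"></td>
      <td class="somename">Table 1 Title 1 name</td>
    </tr>
    <tr></tr>
  </tbody>
</table>
HTML;

$pairs = extract_title_categories(str_get_html($sample));
foreach ($pairs as $pair) {
    echo $pair['title'] . ' => ' . $pair['category'] . "\n";
}
```

For a file on disk, file_get_html() with the file's path can be used in place of str_get_html() to produce the object passed to the helper.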