Parsing an HTML string to get values by keyword

I have an HTML file which is read in as a string. I want to parse it and get the values that follow <TD colSpan=2>Value : in the markup. There are around 10 values I need to get from the HTML file. How can I do that? I am trying something like this, using a start index, an end index and substring:

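  sMainBeginKeyword = "<td>Value : ";
  sBeginKeyword = "<td>Value : ";
  sEndKeyword = "</td>";

  main_begin_index = result.indexOf(sMainBeginKeyword);
  while (main_begin_index != -1) {
    begin_index = main_begin_index;
    end_index = result.indexOf(sEndKeyword, begin_index);
    if (end_index == -1) break;  // no closing tag found: stop
    String deloc = result.substring(begin_index + sBeginKeyword.length(), end_index);
    // advance to the next occurrence so the loop terminates
    main_begin_index = result.indexOf(sMainBeginKeyword, end_index);
  }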

But this looks complicated, and I cannot retrieve more values this way, as I have a lot of values with different keywords.


This sort of thing really does need to be done using an XML or DOM parser: trying to do it with string searches is setting yourself up for failure.

If you loaded the HTML into an XML or DOM parser, the task you're trying to do would be trivial to achieve using XPath notation to find the relevant elements.
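As a rough illustration of that idea, here is a minimal sketch in Java, which is only a guess at your language based on the snippet. It uses the JDK's built-in DOM parser and XPath API. The class name `TdValueExtractor` and the method `extractValues` are made up for this example, and it assumes the markup is well-formed XML with lowercase `td` tags. Real-world HTML such as `<TD colSpan=2>` (uppercase tag, unquoted attribute) would be rejected by a strict XML parser, so a lenient HTML parser would probably be needed there.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
class TdValueExtractor {
    // Keyword taken from the question; other keywords would work the same way.
    private static final String PREFIX = "Value :";

    // Assumption: the input is well-formed XML (XHTML) with lowercase <td> tags.
    static List<String> extractValues(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Select every <td> whose whitespace-normalized text starts with the keyword.
            NodeList cells = (NodeList) xpath.evaluate(
                    "//td[starts-with(normalize-space(.), '" + PREFIX + "')]",
                    doc, XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < cells.getLength(); i++) {
                // Collapse whitespace the same way the XPath expression does.
                String text = cells.item(i).getTextContent().trim().replaceAll("\\s+", " ");
                // Drop the keyword and keep whatever follows it.
                values.add(text.substring(PREFIX.length()).trim());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse input as XML", e);
        }
    }
}
```

Calling `TdValueExtractor.extractValues(result)` on the string you read from the file would return all matching values in document order. Supporting your other keywords would then be a matter of changing the XPath expression rather than writing a new index-juggling loop for each one.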

You haven't specified which language or platform you're working on (and the code sample you've given isn't enough to tell for certain), so it's hard to be any more specific.

Hope that helps.
