TagSoup vs. Jsoup vs. HTML Parser vs. HotSax vs [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance. Closed 9 years ago.

The abundance of HTML parsers to choose from (and stick with) is mind-boggling:

http://java-source.net/open-source/html-parsers

How do I choose one that best suits the following requirements:

  1. Mature (fewer bugs than the rest)
  2. Live and breathing (i.e. being maintained)
  3. Fast and resource-efficient (intended to run on Android)

Based on your experience, which HTML parser would you recommend (for meeting the above requirements) and why?


Well, I found the answer, which was given by @BalusC on a different thread:

  1. If you just want to use an XML-based tool to traverse it: JTidy
  2. If you want to unit test the HTML: HtmlUnit
  3. If you want to extract specific data from the HTML: Jsoup

Thank you @BalusC.
