
HTML parser that is compatible with JRuby?

I'm having a difficult time locating an HTML parser that works with JRuby.

I've become fond of using Nokogiri for HTML parsing, but Nokogiri requires libxml2.dll, which isn't available on my machine, and I can't be sure it will be available on every user's machine.

I attempted to use another favorite, Scrubyt, but that relies on Mechanize, which also requires Nokogiri.

What Ruby HTML parser do you recommend for use with JRuby?


The pure-Java version of Nokogiri does not depend on libxml2 or any other native binary. See http://wiki.github.com/tenderlove/nokogiri/pure-java-nokogiri-for-jruby.

Hpricot is another popular HTML parsing library that also has a pure-Java port. Its functionality is similar; in fact, Hpricot was the parser that popularized using CSS selectors for HTML parsing.
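For what it's worth, here is a minimal sketch of CSS-selector parsing with Nokogiri. It assumes the nokogiri gem is already installed in your JRuby environment (e.g. via `jruby -S gem install nokogiri`); the `extract_links` helper and the sample markup are made up for illustration and are not part of either library.

```ruby
# Load Nokogiri, but tell the user what to do if the gem is missing
# instead of dying with a bare LoadError.
begin
  require 'nokogiri'
rescue LoadError
  warn 'nokogiri is not installed; try `jruby -S gem install nokogiri`'
end

# Hypothetical helper: returns the href of every <a> element that has one.
def extract_links(html)
  doc = Nokogiri::HTML(html)                 # parse the (possibly messy) HTML
  doc.css('a[href]').map { |a| a['href'] }   # CSS selector, then read the attribute
end
```

Calling `extract_links('<a href="/docs">Docs</a>')` should give you an array containing `"/docs"`. The same script should run unchanged under MRI, so you can develop against either interpreter.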


Why not use the pure-java version of nokogiri?

http://github.com/tenderlove/nokogiri/tree/java

