Does XML::LibXML::Reader read HTML?
I didn't find anything about parsing HTML in the XML::LibXML::Reader documentation, and when I tried to parse an HTML page it didn't work. Is my conclusion right that XML::LibXML::Reader doesn't work with HTML?
No, unless the document is really XHTML. XML is much more restrictive than HTML, and XML parsers normally can't parse HTML.
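To see the restriction in practice, here is a minimal sketch that feeds a small fragment of ordinary HTML to XML::LibXML::Reader. The sample markup is made up for illustration; the unclosed `<br>` is fine in HTML but not well-formed XML, so the reader should give up partway through. The loop checks both the return value of `read` and any exception, since I'm not assuming which of the two signals the failure on a given version.

```perl
use strict;
use warnings;
use XML::LibXML::Reader;

# Hypothetical sample: valid HTML, but not well-formed XML (unclosed <br>).
my $html = '<p>one<br>two</p>';

my $reader = XML::LibXML::Reader->new(string => $html);

# read() returns 1 for each node, 0 at end of input, -1 on error;
# parse errors may also be raised as exceptions, so guard with eval.
my $ok = eval {
    while (1) {
        my $rc = $reader->read;
        last if $rc == 0;
        die "reader reported a parse error\n" if $rc < 0;
    }
    1;
};

print $ok ? "parsed as XML\n" : "not parseable as XML: $@";
```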
HTML::TokeParser (or its base class HTML::PullParser) is the most similar to XML::LibXML::Reader (but not all that similar).
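As a rough illustration of the pull-style approach, this sketch uses HTML::TokeParser to walk an HTML string and collect links. The sample markup and the `@links` variable are assumptions for the example, not something from the question.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample document.
my $html = '<html><body><a href="/one">One</a> <a href="/two">Two</a></body></html>';

# Passing a scalar reference parses the string itself rather than a file.
my $parser = HTML::TokeParser->new(\$html)
    or die "Cannot create parser: $!";

my @links;
# get_tag('a') skips ahead to the next <a> start tag and returns
# [tag name, attribute hash, attribute order, original text].
while (my $tag = $parser->get_tag('a')) {
    my $href = $tag->[1]{href};
    next unless defined $href;
    # Collect the text up to the matching </a>.
    my $text = $parser->get_trimmed_text('/a');
    push @links, [$href, $text];
}

print "$_->[0] => $_->[1]\n" for @links;
```

Like the XML reader, this pulls items from the stream on demand, but it works at the level of tags and text tokens rather than XML nodes.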
You might want to look at HTML-Tree for something similar to LibXML that does work with HTML. There's also HTML::TreeBuilder::LibXML, which provides an HTML::TreeBuilder::XPath-compatible interface on top of libxml2's HTML parser.
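For the tree-based route, a minimal sketch with HTML::TreeBuilder from the HTML-Tree distribution might look like this. The sample markup is hypothetical, and it deliberately leaves `<p>` and `<br>` unclosed to show that the parser tolerates ordinary HTML.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample: sloppy but valid HTML.
my $html = '<html><body><p>Intro<br><a href="/one">One</a><p><a href="/two">Two</a></body></html>';

# Build the whole tree in memory from a string.
my $tree = HTML::TreeBuilder->new_from_content($html);

# look_down returns every element matching the criteria, in document order.
my @anchors = $tree->look_down(
    _tag => 'a',
    sub { defined $_[0]->attr('href') },
);

my @hrefs = map { $_->attr('href') } @anchors;
print "$_\n" for @hrefs;

# HTML::Element trees should be deleted explicitly when done.
$tree->delete;
```

Unlike a reader or pull parser, this loads the full document before you query it, so it trades memory for convenience.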
No, but HTML::TreeBuilder::LibXML implements a compatible interface on an HTML parser.
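A short sketch of that module, following the pattern in its synopsis (`new`, `parse`, `eof`, then `findnodes`). The sample markup and the XPath expression are assumptions for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::LibXML;

# Hypothetical sample document.
my $html = '<html><body><a href="/one">One</a> <a href="/two">Two</a></body></html>';

my $tree = HTML::TreeBuilder::LibXML->new;
$tree->parse($html);
$tree->eof;

# findnodes takes an XPath expression and returns wrapped nodes
# that answer HTML::Element-style methods such as attr and as_text.
my @pairs;
for my $node ($tree->findnodes('//a[@href]')) {
    push @pairs, [$node->attr('href'), $node->as_text];
}

print "$_->[0] => $_->[1]\n" for @pairs;
```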