Getting started with a parser in Java code
I am new to parsers. I would like to fetch specific data from a website, and I understand I need a parser for that. How do I get started with parsers? What do I need to download? What would the code look like to fetch data from a website using a parser in Java?
My advice would be to use an open-source HTML parser such as HtmlCleaner: http://htmlcleaner.sourceforge.net/ (download the JAR from that site and add it to your classpath).
You can use HtmlCleaner (or a similar library) to build a DOM representation of the web page, and then query that DOM to extract whatever information you want from the page.
The process looks something like this:
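import java.net.URL;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
URL url = new URL("website you want to load");
HtmlCleaner cleaner = new HtmlCleaner();
TagNode htmlNode = cleaner.clean(url.openStream()); // root of the cleaned-up DOM
// perform queries on the DOM to extract information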
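To give an idea of what "performing queries on the DOM" can look like, here is a minimal sketch that pulls every link target out of a page with an XPath expression. Note that it does not use HtmlCleaner: it swaps in the DOM and XPath classes that ship with the JDK, so it runs without downloading anything. It also assumes the markup is already well-formed, which real web pages often are not (that is exactly the problem a library like HtmlCleaner solves). The class name `DomQuerySketch`, the method `extractLinks`, and the example.com URLs are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of HtmlCleaner or any other library.
class DomQuerySketch {

    // Returns the href value of every <a> element, in document order.
    // Assumption: the input is well-formed markup (every tag closed, attributes quoted).
    static List<String> extractLinks(String wellFormedHtml) {
        try {
            // Step 1: build a DOM tree from the markup.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document dom = builder.parse(
                    new ByteArrayInputStream(wellFormedHtml.getBytes(StandardCharsets.UTF_8)));

            // Step 2: query the tree. "//a/@href" selects the href attribute of all <a> elements.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hrefs = (NodeList) xpath.evaluate("//a/@href", dom, XPathConstants.NODESET);

            // Step 3: copy the attribute values into a plain list.
            List<String> links = new ArrayList<>();
            for (int i = 0; i < hrefs.getLength(); i++) {
                links.add(hrefs.item(i).getNodeValue());
            }
            return links;
        } catch (Exception e) {
            // Parsing or XPath evaluation failed, most likely because the markup was not well-formed.
            throw new IllegalArgumentException("Could not parse or query the markup", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body>"
                + "<a href=\"http://example.com/a\">First</a>"
                + "<p><a href=\"http://example.com/b\">Second</a></p>"
                + "</body></html>";
        for (String link : extractLinks(page)) {
            System.out.println(link);
        }
    }
}
```

With HtmlCleaner the overall shape stays the same: the tree comes from `clean(...)` instead of the JDK's `DocumentBuilder`, and you query the resulting `TagNode` with the methods described in the library's documentation.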