Where can I learn more about parsing text in Java?
I'm in a Data Structures class (in Java) this semester, but we're doing a lot of parsing on text files to populate the structures we design. The focus is on the structures themselves, not on parsing algorithms. I feel sort of weak in the area and was wondering if anyone could point me to a book or site on the subject. Design patterns, libraries, styles, etc. Thanks!
For parsing basic text files in Java, I would start by examining the Scanner class:
- Java Scanner Tutorial
- Java Scanner API
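As a rough illustration of the Scanner approach, here is a minimal sketch that pulls the integers out of a piece of text and skips everything else. The class and method names (ScannerDemo, readInts) are made up for this example; to read from a file instead of a String, you would pass a java.io.File to the Scanner constructor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerDemo {
    // Hypothetical helper: collects every whitespace-separated integer token in the text.
    static List<Integer> readInts(String text) {
        List<Integer> values = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            while (in.hasNext()) {
                if (in.hasNextInt()) {
                    values.add(in.nextInt());
                } else {
                    in.next(); // discard a token that is not an integer
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(readInts("10 apples 20 pears 30")); // prints [10, 20, 30]
    }
}
```

Scanner splits on whitespace by default; useDelimiter() changes that if your file uses commas or some other separator.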
For any text parsing, a basic knowledge of regular expressions (regex) is a good thing to have:
- Java Regex Tutorial
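To give a feel for how regex fits in, here is a small sketch that parses lines of the form "key = value" with Pattern and Matcher. The line format and the names RegexDemo and parseKeyValue are assumptions made for the example, not something from your assignment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexDemo {
    // Assumed line format: optional spaces, a word, '=', then the value.
    private static final Pattern KEY_VALUE =
            Pattern.compile("^\\s*(\\w+)\\s*=\\s*(.*?)\\s*$");

    // Returns {key, value}, or null if the line does not match the assumed format.
    static String[] parseKeyValue(String line) {
        Matcher m = KEY_VALUE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] kv = parseKeyValue("  name = Ada Lovelace  ");
        System.out.println(kv[0] + " -> " + kv[1]); // prints name -> Ada Lovelace
    }
}
```

Compiling the Pattern once and reusing it is usually preferable to calling String.matches() on every line, which recompiles the expression each time.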
If Scanner isn't doing the job, you can always parse through a text file line-by-line with a BufferedReader backed by a FileReader.
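import java.io.BufferedReader;
import java.io.FileReader;

// try-with-resources closes the reader even if readLine() throws an IOException
try (BufferedReader reader = new BufferedReader(new FileReader("/path/to/file.txt"))) {
    for (String line = reader.readLine(); line != null; line = reader.readLine()) {
        // process your line here
    }
}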
Scanner may again be useful here for picking each line apart, and you could also look into String.split() or the Java Pattern API.
- Java String.split() API
- Java Pattern API
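For the String.split() route, a minimal sketch might look like the following. It assumes a simple comma-separated line with no quoted fields; SplitDemo and splitCsvLine are hypothetical names for illustration.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits a comma-separated line and strips whitespace around each field.
    // The limit of -1 keeps trailing empty fields instead of dropping them.
    static String[] splitCsvLine(String line) {
        return line.trim().split("\\s*,\\s*", -1);
    }

    public static void main(String[] args) {
        String[] fields = splitCsvLine("Ada, Lovelace ,1815,");
        System.out.println(fields.length);            // prints 4
        System.out.println(Arrays.toString(fields));  // prints [Ada, Lovelace, 1815, ]
    }
}
```

Note that the argument to split() is a regular expression, so characters such as '|' or '.' need escaping when used as delimiters.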
Files can come in many formats, however. For advice on the best way to parse a file in a given well-defined format, Google will be your friend. Or you can always post a more specific question here with the format that is giving you trouble.
The book "Design Patterns" describes the structure of a recursive-descent parser.
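To show what recursive descent looks like in practice, here is a minimal sketch of an arithmetic evaluator. It is not taken from the book; the grammar (expr, term, factor) and the class name ExprParser are assumptions chosen for the example. Each grammar rule becomes one method, and the methods call each other the same way the rules refer to each other.

```java
class ExprParser {
    // Assumed grammar:
    //   expr   := term (('+' | '-') term)*
    //   term   := factor (('*' | '/') factor)*
    //   factor := NUMBER | '(' expr ')'
    private final String src;
    private int pos = 0;

    ExprParser(String src) { this.src = src; }

    static double evaluate(String text) {
        ExprParser p = new ExprParser(text);
        double value = p.expr();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    private double expr() {
        double value = term();
        while (true) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    private double term() {
        double value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    private double factor() {
        if (accept('(')) {
            double value = expr();
            if (!accept(')')) {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length() && (isAsciiDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        return Double.parseDouble(src.substring(start, pos));
    }

    private static boolean isAsciiDigit(char c) { return c >= '0' && c <= '9'; }

    // Consumes the character c if it is next (after whitespace); reports whether it did.
    private boolean accept(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("2 + 3 * (4 - 1)")); // prints 11.0
    }
}
```

Operator precedence falls out of the structure: term() binds tighter than expr() simply because expr() calls it.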
The JavaCC compiler-compiler can be used to generate parsers in Java.
You can do basic text parsing with the StringTokenizer class, the String.split() methods, and the Pattern and Matcher classes for regular expressions.
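As a quick sketch of the StringTokenizer option (the names TokenizerDemo and tokens are made up for the example): each character in the delimiter string acts as a separator, and empty tokens are skipped. The class's own Javadoc describes it as a legacy class and recommends String.split() or java.util.regex for new code, so treat this mainly as something you may run into in older material.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects the tokens of text, treating every character in delimiters as a separator.
    static List<String> tokens(String text, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiters);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a,b;;c", ",;")); // prints [a, b, c]
    }
}
```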