How to use regexes to parse a file in Java?

I'm trying to use a series of regular expressions to parse tokens from a file. I need to count newlines and be able to separate tokens that don't have a space between them. Unfortunately, java.util.Scanner's findWithinHorizon() method searches the entire rest of the input stream (up to horizon) for the START of the regex match, but I want to match the regex starting at the current file position. Specifically, I have a bunch of regexes and want to loop through them to see which one matches starting at the current position in the file, then advance the file position to right after the match, and continue. Is this possible?

Scanner's next() method seems to be useless for this because it enforces delimiters and the regex must match the entire token; I want to match from the current file position, get the matched string, and advance the file position to just past the match.


Options:

  1. Read the whole file into memory as a String. Then use a Matcher directly at the positions you want, for example by setting region() to start at the current position and calling lookingAt().

  2. Use a FileChannel acquired from a RandomAccessFile as the input for the Scanner. You can then directly manipulate the position of the channel.

  3. Use a FileChannel as above, but use Matcher directly for greater flexibility.
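
To make option 1 concrete, here is a minimal sketch of a loop that tries each regex at the current position and advances past whichever one matches. The class name AnchoredTokenizer, the Result holder, and the token patterns are all made up for illustration; substitute your own. It relies on Matcher.region() to set the start position and Matcher.lookingAt() to require that the match begin exactly there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoredTokenizer {
    // Hypothetical token patterns, tried in this order at every position.
    private static final int NEWLINE = 0;
    private static final int WHITESPACE = 1;
    private static final Pattern[] PATTERNS = {
        Pattern.compile("\\r?\\n"),                 // newline (counted, not kept)
        Pattern.compile("[ \\t]+"),                 // spaces and tabs (skipped)
        Pattern.compile("[A-Za-z_][A-Za-z0-9_]*"),  // identifier
        Pattern.compile("[0-9]+"),                  // integer literal
        Pattern.compile("[-+*/=()]")                // single-character operator
    };

    static final class Result {
        final List<String> tokens = new ArrayList<>();
        int newlines = 0;
    }

    static Result tokenize(CharSequence input) {
        // One reusable Matcher per pattern; region() resets it each time.
        Matcher[] matchers = new Matcher[PATTERNS.length];
        for (int i = 0; i < PATTERNS.length; i++) {
            matchers[i] = PATTERNS[i].matcher(input);
        }

        Result result = new Result();
        int pos = 0;
        while (pos < input.length()) {
            boolean matched = false;
            for (int i = 0; i < matchers.length; i++) {
                Matcher m = matchers[i];
                m.region(pos, input.length());
                // lookingAt() succeeds only if the match starts at pos.
                if (m.lookingAt() && m.end() > pos) {
                    if (i == NEWLINE) {
                        result.newlines++;
                    } else if (i != WHITESPACE) {
                        result.tokens.add(m.group());
                    }
                    pos = m.end(); // advance to right after the match
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException("No pattern matches at offset " + pos);
            }
        }
        return result;
    }
}
```

Calling AnchoredTokenizer.tokenize("x=12\ny+3") separates the tokens even though there are no spaces between them, and counts one newline. To feed it a file, read the contents into a String first (on Java 11 or later, Files.readString(path) does this).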

An example of using a Matcher with a RandomAccessFile:

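import java.nio.*;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.util.regex.*;

// file is a RandomAccessFile opened "rw": lock() needs a writable channel
FileChannel fc = file.getChannel();
fc.lock(); // so it doesn't change under you

ByteBuffer bb = ByteBuffer.allocate(BUFFER_SIZE);
fc.read(bb);
bb.flip(); // switch from filling the buffer to reading it

// Decode explicitly (UTF-8 assumed); asCharBuffer() would treat the bytes as UTF-16
CharBuffer cb = StandardCharsets.UTF_8.decode(bb);
Matcher matcher = pattern.matcher(cb);
// matcher.lookingAt() matches only at the start of cb
// etc.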