Parse and change a file line-by-line while preserving EOL characters in Java
One hell of a long question :)
Here's how I usually do it:
StringBuilder b = new StringBuilder();
BufferedReader r = new BufferedReader(new StringReader(s));
String line;
while ((line = r.readLine()) != null)
    b.append(doSomethingToTheString(line)).append("\n");
However, this replaces all the newline characters in the file with a line feed, plus it adds one at the end if there wasn't one. What I want is to preserve EOL characters, even if it's mixed up like this:
Hello\r\n
World\n
This is\r
Messed up
What would be the most elegant/efficient way to do this?
That's not a long question :)
Basically you won't be able to do anything with BufferedReader.readLine() here. It always removes the line terminator, and there's nothing you can do about that.
However, you could look at the code within readLine() (assuming the licence is compatible with whatever context you're writing code in) and basically perform the same task yourself, but preserving the line terminators.
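As a rough sketch of "perform the same task yourself", something like the following could work. The names EolPreservingReader, readLineKeepingEol() and transform() are made up for illustration; only PushbackReader and friends are real JDK classes. It assumes \r\n, \n and a lone \r are the only terminators you care about.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.UnaryOperator;

// Hypothetical helper: reads lines but keeps whatever terminator each line had.
class EolPreservingReader {
    private final PushbackReader in;

    EolPreservingReader(Reader reader) {
        // One char of pushback is enough to peek past a '\r'.
        this.in = new PushbackReader(reader, 1);
    }

    /** Returns the next line including its terminator, or null at end of input. */
    String readLineKeepingEol() throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);
            if (c == '\n') break;
            if (c == '\r') {
                int next = in.read();
                if (next == '\n') sb.append('\n');      // "\r\n" pair
                else if (next != -1) in.unread(next);   // lone '\r'
                break;
            }
        }
        return sb.length() == 0 ? null : sb.toString();
    }

    /** Applies f to each line body and re-attaches the original terminator. */
    static String transform(String text, UnaryOperator<String> f) {
        EolPreservingReader r = new EolPreservingReader(new StringReader(text));
        StringBuilder out = new StringBuilder();
        try {
            String line;
            while ((line = r.readLineKeepingEol()) != null) {
                // Find where the body ends and the terminator begins.
                int end = line.length();
                while (end > 0 && (line.charAt(end - 1) == '\n' || line.charAt(end - 1) == '\r')) end--;
                out.append(f.apply(line.substring(0, end))).append(line, end, line.length());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

When reading from a file rather than a String, wrapping the underlying reader in a BufferedReader before handing it to this class would probably be wise, since PushbackReader does no buffering of its own.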
If you mean to keep the line terminators, use an InputStream instead of a Reader. You'll need to implement your own readLine() function that looks for a standard newline character/pair and leaves it in the return value.
If you are trying to output a file similar to the input that simply has the default line endings of the host OS, use a Writer or append the line terminator found using System.getProperty("line.separator").
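For that second case (normalising to one line ending rather than preserving the mix), a minimal sketch might look like this. EolNormalizer and normalize() are hypothetical names; System.lineSeparator() is the Java 7+ shorthand for the line.separator property. Note that it still appends a terminator after the last line even if the input had none.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper: rewrites every line ending to a single chosen terminator.
class EolNormalizer {
    static String normalize(String text, String eol) {
        BufferedReader r = new BufferedReader(new StringReader(text));
        StringBuilder out = new StringBuilder();
        try {
            String line;
            // readLine() strips "\r\n", "\n" or "\r"; we append the chosen one instead.
            while ((line = r.readLine()) != null) {
                out.append(line).append(eol);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Calling EolNormalizer.normalize(text, System.lineSeparator()) would give the host OS default.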
Here's a sketch of a solution, since I don't have time to elaborate a complete code snippet.
1. You need a class (say, WeirdLine) to represent each line, basically with a String field for the line contents, and a byte[] field for the line terminator: class WeirdLine { final String line; final byte[] term; }
2. You need a class (say, WeirdLineReader) to wrap an InputStream. It can expose a method readWeirdLine() which returns an instance of WeirdLine, or null when the stream is empty.
3. WeirdLineReader will need to maintain an internal byte buffer. When readWeirdLine() is called, shovel bytes into the buffer (InputStream.read()), growing it as necessary, until:
   a. read() returns -1, end of file. readWeirdLine() returns an instance with a null terminator field, and the entire contents of the String you get from new String(buffer[]).
   b. A findTerminator() method scans to find the byte sequence \r\n or \n or whatever other terminators you want to cope with. This method should also return a WeirdLine or null, and should leave the internal buffer cleaned out/truncated if so.
   c. The internal buffer is simply empty, return null.
4. You then need to write a corresponding mechanism to write WeirdLines back out, preserving the terminations.
It might be easiest to use a ByteBuffer rather than a raw byte[] for the internal buffer.
You could probably adapt the code to BufferedReader if this sounds daunting.
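To make the sketch above a bit more concrete, here is one way it might be fleshed out. This is a simplified take, not the exact design described: it checks for terminators byte by byte instead of using a separate findTerminator() method, uses a ByteArrayOutputStream as the growing buffer, and assumes an ASCII-compatible encoding (UTF-8 here) so that the bytes 0x0D and 0x0A always mean \r and \n. The WeirdLines.transform() helper is an added, hypothetical convenience for round-tripping.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

final class WeirdLine {
    final String line;   // line contents, without the terminator
    final byte[] term;   // terminator bytes, or null if the input ended without one
    WeirdLine(String line, byte[] term) { this.line = line; this.term = term; }
}

final class WeirdLineReader {
    private final PushbackInputStream in;

    WeirdLineReader(InputStream in) {
        this.in = new PushbackInputStream(in, 1); // one byte of look-ahead after '\r'
    }

    /** Returns the next line with its terminator, or null when the stream is empty. */
    WeirdLine readWeirdLine() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') return make(buf, new byte[] {'\n'});
            if (b == '\r') {
                int next = in.read();
                if (next == '\n') return make(buf, new byte[] {'\r', '\n'});
                if (next != -1) in.unread(next);
                return make(buf, new byte[] {'\r'});
            }
            buf.write(b);
        }
        // End of file: either nothing was left, or a final unterminated line.
        return buf.size() == 0 ? null : make(buf, null);
    }

    private static WeirdLine make(ByteArrayOutputStream buf, byte[] term) {
        // Assumption: the data is UTF-8 (or plain ASCII).
        return new WeirdLine(new String(buf.toByteArray(), StandardCharsets.UTF_8), term);
    }
}

final class WeirdLines {
    /** Applies f to every line and writes each one back with its original terminator. */
    static byte[] transform(byte[] input, UnaryOperator<String> f) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            WeirdLineReader r = new WeirdLineReader(new ByteArrayInputStream(input));
            WeirdLine wl;
            while ((wl = r.readWeirdLine()) != null) {
                out.write(f.apply(wl.line).getBytes(StandardCharsets.UTF_8));
                if (wl.term != null) out.write(wl.term);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

For real files you would likely wrap the FileInputStream in a BufferedInputStream first, since this reads one byte at a time.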