Iterating over the content of a text file line by line - is there a best practice? (vs. PMD's AssignmentInOperand)
We have a Java Application that has a few modules that know to read text files. They do it q开发者_开发百科uite simply with a code like this:
BufferedReader br = new BufferedReader(new FileReader(file));
String line = null;
while ((line = br.readLine()) != null)
{
... // do stuff to file here
}
I ran PMD on my project and got the 'AssignmentInOperand' violation on the while (...)
line.
Is there a simpler way of doing this loop other than the obvious:
String line = br.readLine();
while (line != null)
{
... // do stuff to file here
line = br.readLine();
}
Is this considered a better practice? (although we "duplicate" the line = br.readLine()
code?)
I know is an old post but I just had the same need (almost) and I solve it using a LineIterator from FileUtils in Apache Commons. From their javadoc:
LineIterator it = FileUtils.lineIterator(file, "UTF-8");
try {
while (it.hasNext()) {
String line = it.nextLine();
// do something with line
}
} finally {
it.close();
}
Check the documentation: http://commons.apache.org/proper/commons-io/javadocs/api-release/org/apache/commons/io/LineIterator.html
The support for streams and Lambdas in java-8 and Try-With-Resources of java-7 allows you to achive what you want in more compact syntax.
Path path = Paths.get("c:/users/aksel/aksel.txt");
try (Stream<String> lines = Files.lines(path)) {
lines.forEachOrdered(line->System.out.println(line));
} catch (IOException e) {
//error happened
}
I generally prefer the former. I don't generally like side-effects within a comparison, but this particular example is an idiom which is so common and so handy that I don't object to it.
(In C# there's a nicer option: a method to return an IEnumerable<string>
which you can iterate over with foreach; that isn't as nice in Java because there's no auto-dispose at the end of an enhanced for loop... and also because you can't throw IOException
from the iterator, which means you can't just make one a drop-in replacement for the other.)
To put it another way: the duplicate line issue bothers me more than the assignment-within-operand issue. I'm used to taking in this pattern at a glance - with the duplicate line version I need to stop and check that everything's in the right place. That's probably habit as much as anything else, but I don't think it's a problem.
I routinely use the while((line = br.readLine()) != null)
construct... but, recently I came accross this nice alternative:
BufferedReader br = new BufferedReader(new FileReader(file));
for (String line = br.readLine(); line != null; line = br.readLine()) {
... // do stuff to file here
}
This is still duplicating the readLine()
call code, but the logic is clear, etc.
The other time I use the while(( ... ) ...)
construct is when reading from a stream in to a byte[]
array...
byte[] buffer = new byte[size];
InputStream is = .....;
int len = 0;
while ((len = is.read(buffer)) >= 0) {
....
}
This can also be transformed in to a for loop with:
byte[] buffer = new byte[size];
InputStream is = .....;
for (int len = is.read(buffer); len >= 0; len = is.read(buffer)) {
....
}
I am not sure I really prefer the for-loop alternatives.... but, it will satisfy any PMD tool, and the logic is still clear, etc.
Based on Jon's answer I got to thinking it should be easy enough to create a decorator to act as a file iterator so you can use a foreach loop:
public class BufferedReaderIterator implements Iterable<String> {
private BufferedReader r;
public BufferedReaderIterator(BufferedReader r) {
this.r = r;
}
@Override
public Iterator<String> iterator() {
return new Iterator<String>() {
@Override
public boolean hasNext() {
try {
r.mark(1);
if (r.read() < 0) {
return false;
}
r.reset();
return true;
} catch (IOException e) {
return false;
}
}
@Override
public String next() {
try {
return r.readLine();
} catch (IOException e) {
return null;
}
}
@Override
public void remove() {
throw new UnsupportedOperationException();
}
};
}
}
Fair warning: this suppresses IOExceptions that might occur during reads and simply stops the reading process. It's unclear that there's a way around this in Java without throwing runtime exceptions as the semantics of the iterator methods are well defined and must be conformed to in order to use the for-each syntax. Also, running multiple iterators here would have some strange behavior; so I'm not sure this is recommended.
I did test this, though, and it does work.
Anyway, you get the benefit of for-each syntax using this as a kind of decorator:
for(String line : new BufferedReaderIterator(br)){
// do some work
}
Google's Guava Libraries provide an alternative solution using the static method CharStreams.readLines(Readable, LineProcessor<T>) with an implementation of LineProcessor<T>
for processing each line.
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
CharStreams.readLines(br, new MyLineProcessorImpl());
} catch (IOException e) {
// handling io error ...
}
The body of the while
loop is now placed in the LineProcessor<T>
implementation.
class MyLineProcessorImpl implements LineProcessor<Object> {
@Override
public boolean processLine(String line) throws IOException {
if (// check if processing should continue) {
// do sth. with line
return true;
} else {
// stop processing
return false;
}
}
@Override
public Object getResult() {
// return a result based on processed lines if needed
return new Object();
}
}
I'm a bit surprised the following alternative was not mentioned:
while( true ) {
String line = br.readLine();
if ( line == null ) break;
... // do stuff to file here
}
Before Java 8 it was my favorite because of its clarity and not requiring repetition. IMO, break
is a better option to expressions with side-effects. It's still a matter of idioms, though.
AssignmentInOperand is a controversial rule in PMD, the reason of this rule is: "this can make code more complicated and harder to read" (please refer http://pmd.sourceforge.net/rules/controversial.html)
You could disable that rule if you really want to do it that way. In my side I prefer the former.
精彩评论