What's the preferred pattern for reading lines from a file in C++?
I've seen at least two ways of reading lines from a file in C++ tutorials:
std::ifstream fs("myfile.txt");
if (fs.is_open()) {
    while (fs.good()) {
        std::string line;
        std::getline(fs, line);
        // ...
    }
}
and:
std::ifstream fs("myfile.txt");
std::string line;
while (std::getline(fs, line)) {
    // ...
}
Of course, I can add a few checks to make sure that the file exists and is opened. Other than the exception handling, is there a reason to prefer the more-verbose first pattern? What's your standard practice?
while (std::getline(fs, line))
{}
This is not only correct but also preferable, because it is idiomatic.
I assume that in the first case you're not checking fs after std::getline() with if(!fs) break; or something equivalent. If you don't, then the first case is completely wrong. If you do, the second one is still preferable, as it's more concise and its logic is clearer.
The function good() should be used after you have made an attempt to read from the stream; it is used to check whether the attempt was successful. In your first case, you don't do so. After std::getline(), you assume that the read was successful, without even checking what fs.good() returns. Also, you seem to assume that if fs.good() returns true, std::getline would successfully read a line from the stream. You're going in exactly the opposite direction: fs.good() only reports on reads that have already happened, so if std::getline fails to read a line, fs.good() returns false afterwards.
The documentation at cplusplus says about good() that:
The function returns true if none of the stream's error flags (eofbit, failbit and badbit) are set.
That is, a failure flag is set only after you attempt to read data from an input stream and the attempt fails; good() then returns false as an indication of the failure.
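To make the consequence concrete, here is a minimal sketch that counts loop iterations for both patterns. The helper names count_with_good and count_with_getline are made up for this illustration, and the functions take a std::istream& so they can be fed from a std::istringstream instead of a real file.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Pattern from the question: test good() BEFORE reading.
// Hypothetical helper, named only for this illustration.
std::size_t count_with_good(std::istream& in) {
    std::size_t iterations = 0;
    while (in.good()) {
        std::string line;
        std::getline(in, line);  // result is not checked, as in the question
        ++iterations;            // so the final, failed read is counted too
    }
    return iterations;
}

// Idiomatic pattern: the read itself is the loop condition.
std::size_t count_with_getline(std::istream& in) {
    std::size_t iterations = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++iterations;            // only reached after a successful read
    }
    return iterations;
}
```

For the input "one\ntwo\n", count_with_good runs its body three times (the third time with an empty line from a failed read), while count_with_getline runs exactly twice.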
If you want to limit the scope of the line variable to the loop only, then you can write a for loop as:
for(std::string line; std::getline(fs, line); )
{
//use 'line'
}
Note: this solution came to my mind after reading @john's solution, but I think it's better than his version.
Read a detailed explanation here of why the second one is preferable and idiomatic:
- Linux | Segmentation Fault in C++ - Due to the function ifstream
Or read this nicely written blog by @Jerry Coffin:
- Reading files
Think of this as an extended comment to Nawaz' already excellent answer.
Regarding your first option,
while (fs.good()) {
std::string line;
std::getline(fs, line);
...
This has multiple problems. Problem number one is that the while condition is in the wrong place and is superfluous. It's in the wrong place because fs.good() indicates whether or not the most recent action performed on the file was OK. A while condition should be with respect to the upcoming actions, not the previous ones. There is no way to know whether the upcoming action on the file will be OK. What upcoming action? fs.good() does not read your code to see what that upcoming action is.
Problem number two is that you are ignoring the return status from std::getline(). That's OK if you immediately check the status with fs.good(). So, fixing this up a bit,
while (true) {
std::string line;
if (std::getline(fs, line)) {
...
}
else {
break;
}
}
Alternatively, you can do if (! std::getline(fs, line)) { break; } but now you have a break in the middle of the loop. Yech. It is much, much better to make the exit conditions a part of the loop statement itself if at all possible.
Compare that to
std::string line;
while (std::getline(fs, line)) {
...
}
This is the standard idiom for reading lines from a file. A very similar idiom exists in C. This idiom is very old, very widely used, and very widely viewed as the correct way to read lines from a file.
What if you come from a shop that bans conditionals with side-effects? (There are lots and lots of programming standards that do just that.) There is a way around this without resorting to the break in the middle of the loop approach:
std::string line;
for (std::getline(fs, line); fs.good(); std::getline(fs, line)) {
...
}
Not as ugly as the break approach, but most will agree that this isn't nearly as nice-looking as is the standard idiom.
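One caveat with testing fs.good() in that for loop: good() is also false when eofbit is set, and a successful read of a final line that has no trailing newline sets eofbit. The sketch below compares that condition with !fail(); the function names are made up for this illustration, and the streams are passed in as std::istream& so the behaviour can be tried with a std::istringstream.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// The for-loop form, testing good() after each read.
// Hypothetical helper, named only for this illustration.
std::vector<std::string> lines_checking_good(std::istream& in) {
    std::vector<std::string> out;
    std::string line;
    for (std::getline(in, line); in.good(); std::getline(in, line)) {
        out.push_back(line);
    }
    return out;
}

// Same shape, but testing !fail(), which ignores eofbit on its own.
std::vector<std::string> lines_checking_fail(std::istream& in) {
    std::vector<std::string> out;
    std::string line;
    for (std::getline(in, line); !in.fail(); std::getline(in, line)) {
        out.push_back(line);
    }
    return out;
}
```

With the input "a\nb" (no newline after b), lines_checking_good returns only "a", because reading "b" succeeds but sets eofbit; lines_checking_fail returns both lines. With "a\nb\n" the two agree.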
My recommendation is to use the standard idiom unless some standards idiot has banned its use.
Addendum
Regarding for (std::getline(fs, line); fs.good(); std::getline(fs, line)): This is ugly for two reasons. One is that obvious chunk of replicated code.
Less obvious is that calling getline and then good breaks atomicity. What if some other thread is also reading from the file? This is less of a concern in practice, because concurrent access to a single file stream is not thread-safe anyway: C++11 guarantees freedom from data races only for the synchronized standard stream objects such as std::cin, so a shared std::ifstream needs external locking. Even so, breaking atomicity just to keep the enforcers of the standards happy is a recipe for disaster.
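If a stream really were shared between threads, one way to keep the read and its status check together is to perform both under a single lock. This is only a sketch under that assumption: LockedLineReader is a hypothetical wrapper invented for this example, not something the standard library provides, and it only helps if every thread goes through it.

```cpp
#include <istream>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical wrapper: makes "read a line and report success" one step
// with respect to other callers of next() on the same reader.
class LockedLineReader {
public:
    explicit LockedLineReader(std::istream& in) : in_(in) {}

    // Reads one line into `line`; returns false once no line could be read.
    bool next(std::string& line) {
        std::lock_guard<std::mutex> lock(mutex_);
        // The read and the status test happen while the lock is held.
        return static_cast<bool>(std::getline(in_, line));
    }

private:
    std::istream& in_;   // not owned; must outlive the reader
    std::mutex mutex_;
};
```

Each thread would then loop with while (reader.next(line)) { ... }, which keeps the shape of the standard idiom.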
Actually I prefer another way
for (;;)
{
std::string line;
if (!getline(myFile, line))
break;
...
}
To me it reads better, and the string is scoped correctly (i.e. inside the loop where it is being used, not outside the loop)
But of the two you've written, the second is correct.
The first one destroys and re-creates the string on every iteration, which can deallocate and re-allocate its buffer each time and waste time.
The second one writes into an already existing string, so it can reuse the buffer and avoid that deallocation and reallocation, which typically makes it faster (and better) than the first one.
Try this =>
// reading a text file
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main () {
    string line;
    ifstream myfile ("example.txt");
    if (myfile.is_open())
    {
        // test the result of getline itself, so a failed read is never printed
        while ( getline (myfile,line) )
        {
            cout << line << endl;
        }
        myfile.close();
    }
    else cout << "Unable to open file";
    return 0;
}