Improving performance of ifstream in C++

I apologize if this question is a bit vague or just plain stupid; I am still very much a novice.

I need to extract information from a web log file in C++. The string manipulations are relatively easy; accessing the data in a timely fashion isn't. What I am doing currently:

string str;

ifstream fh("testlog.log",ios::in);

while (getline(fh,str));

From here I get the useful data from the string. This works fine for a log file with 100 entries, but takes forever on a log file with a million or more entries. Any help would be greatly appreciated.


I really suspect that I/O is hurting you more than ifstream here. Have you checked to see that you're actually CPU bound? Most likely you're having disk and cache locality issues.

There may not be a lot you can do in that case.

If it is CPU bound, have you profiled to see where the CPU time is going?
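
One rough way to check, short of a real profiler, is to time the read loop on its own with no parsing at all and compare that against the total run time. A minimal sketch using std::chrono; ReadTiming and time_read_only are hypothetical names, not something from the question:

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical result type: how many lines were read and how long it took.
struct ReadTiming {
    std::size_t lines;
    double seconds;
};

// Hypothetical helper: reads every line and does nothing else, so the
// elapsed time covers only file I/O plus std::getline overhead.
ReadTiming time_read_only(const std::string& path)
{
    std::ifstream fh(path);
    std::string str;
    std::size_t lines = 0;

    const auto start = std::chrono::steady_clock::now();
    while (std::getline(fh, str))
        ++lines;
    const auto stop = std::chrono::steady_clock::now();

    return { lines, std::chrono::duration<double>(stop - start).count() };
}
```

If time_read_only("testlog.log") accounts for most of the total run time, the reading is the bottleneck; if it is only a small fraction, the time is going into the string processing instead.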


After wasting hours and hours of my time, I compiled the same code in Quincy2005 instead of Microsoft Visual Studio. The result was dramatic: execution time dropped from 40 min to 1 min. The same improvement can be accomplished in Microsoft Visual Studio by passing a pointer to the file handler to the getline function. On a Linux-based system it takes about 40 sec to execute. I cursed Microsoft for a good 40 min for wasting my time.


Here is the fastest way I found to read an entire file:

std::ifstream file("test.txt", std::ios::in | std::ios::ate);  // ate: open positioned at the end, so tellg() returns the size

std::size_t fileSize = file.tellg();

std::vector<char> buffer(fileSize);

file.seekg(0, std::ios::beg);

file.read(buffer.data(), fileSize);

std::string str(buffer.begin(), buffer.end());
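
A variant of the same idea, as a sketch: it reads straight into the std::string (skipping the intermediate vector and the extra copy), opens in binary mode so the size reported by tellg() matches the number of bytes read, and checks that the file opened. The name read_whole_file is hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the whole file at `path` as one string.
std::string read_whole_file(const std::string& path)
{
    // std::ios::ate positions the stream at the end on open,
    // so tellg() reports the file size.
    std::ifstream file(path, std::ios::in | std::ios::binary | std::ios::ate);
    if (!file)
        throw std::runtime_error("cannot open " + path);

    const std::streamoff size = file.tellg();
    if (size < 0)
        throw std::runtime_error("cannot determine size of " + path);

    std::string contents(static_cast<std::size_t>(size), '\0');

    file.seekg(0, std::ios::beg);
    file.read(&contents[0], static_cast<std::streamsize>(contents.size()));

    // Keep only what was actually read, in case the read came up short.
    contents.resize(static_cast<std::size_t>(file.gcount()));
    return contents;
}
```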

Yet, if your file is really that big, I strongly suggest processing it as a stream instead of loading it all into memory.


@Errata:

Are you sure that your code would be faster than, say:

std::ifstream in("test.txt");
in.unsetf(std::ios::skipws);
std::string contents;
std::copy(
        std::istream_iterator<char>(in),
        std::istream_iterator<char>(),
        std::back_inserter(contents));

Also, the OP wants line-wise access, which can conveniently be done with std::getline:

std::ifstream in("test.txt");
std::string line;
size_t count = 0;
while (std::getline(in, line))   // reads one whole line per iteration
    if (is_interesting(line))
        ++count;
std::cout << "Interesting log lines: " << count << std::endl;

Of course, you need to define a predicate, e.g.:

static bool is_interesting(const std::string& line)
{ 
    return std::string::npos != line.find("FATAL");
}
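
If you want to keep the std::count_if style over whole lines, note that std::istream_iterator<std::string> extracts whitespace-delimited tokens rather than lines, so the iterator needs a type whose operator>> reads a full line. A minimal sketch; Line and count_interesting are hypothetical names, and is_interesting is the predicate from above, repeated so the snippet compiles on its own:

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <istream>
#include <iterator>
#include <string>

// Hypothetical wrapper: a string whose operator>> reads one whole line.
struct Line : std::string {};

std::istream& operator>>(std::istream& is, Line& line)
{
    return std::getline(is, line);
}

static bool is_interesting(const std::string& line)
{
    return std::string::npos != line.find("FATAL");
}

// Hypothetical helper: counts the lines in `path` that contain "FATAL".
std::size_t count_interesting(const std::string& path)
{
    std::ifstream in(path);
    return static_cast<std::size_t>(std::count_if(
        std::istream_iterator<Line>(in),
        std::istream_iterator<Line>(),
        &is_interesting));
}
```

This still streams the file one line at a time, so memory use stays bounded by the longest line rather than the file size.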