Dealing with end-of-line characters cross-platform in C++

I'm busy writing a generic textfile reader class and I'm struggling to write the code to deal correctly with end-of-line (EOL) characters for Mac, Linux and Windows.

I've done some reading on the issue and came up with the following function in my TextFileReader class to strip EOL characters, once I've read the contents of a textfile using getline( ) and stored the strings in a map.

//! Strip End-Of-Line characters.
void TextFileReader::stripEndOfLineCharacters( )
{
    // Search through container of data and remove newline characters.
    string::size_type stringPosition_ = 0;
    string searchString_ = "\r";
    string replaceString_ = "";

    for ( unsigned int i = 0; i < 1; i++ )
    {
        for ( iteratorContainerOfDataFromFile_
              = containerOfDataFromFile_.begin( );
              iteratorContainerOfDataFromFile_
              != containerOfDataFromFile_.end( );
              iteratorContainerOfDataFromFile_++ )
            {
                while ( ( stringPosition_ = iteratorContainerOfDataFromFile_
                          ->second.find( searchString_,
                                         stringPosition_ ) ) != string::npos )
                {
                    // Replace search string with replace string.
                    iteratorContainerOfDataFromFile_->second
                        .replace( stringPosition_, searchString_.size( ),
                                  replaceString_ );

                    // Advance string position.
                    stringPosition_++;
                }
            }

        // Switch search string.
        searchString_ = "\n";
    }
}

I thought that this would eliminate all EOL characters cross-platform, but that doesn't seem to be the case. It works fine on my Mac, running Mac OS 10.5.8. It doesn't seem to work on Windows systems though. Strangely, on Windows systems running this function only strips the EOL character from the first string in the map, and the rest of them are still one character too long.

This leads me to thinking that maybe I can't just replace the "\r" and "\n" characters, but everything I read suggests that it's the combination of the two that Windows uses to represent EOL characters.

To make it more explicit, here's a step-by-step layout of what I'm attempting to do. I have two textfiles called testFileMadeWithWindows.txt and testFileMadeWithMac.txt.

Open the first file with Notepad on a Windows machine and it contains the following:

This is line 1.

This is line 2.

This is line 3.

Open the second file with TextEdit on a Mac and it contains the following:

This is line 1.

This is line 2.

This is line 3.

In other words, the content of both files is intended to be identical. I want to read both of these files using my TextFileReader class and store the strings in maps. To achieve this I use the getline( ) function.

When I read in testFileMadeWithWindows.txt using getline( ), it turns out that the string sizes are as follows:

16

16

15

Similarly, when I read in testFileMadeWithMac.txt using getline( ), it turns out that the string sizes are as follows:

16

16

15

I now execute the stripEndOfLineCharacters( ) function that I posted above on the maps containing this data.

For testFileMadeWithWindows.txt this results in the following string sizes:

15

16

15

For testFileMadeWithMac.txt this results in the following string sizes:

15

15

15

I use string::compare to compare the strings I have read in from the textfiles with the expected string data, which should be:

This is line 1.

This is line 2.

This is line 3.

The Windows comparison fails, specifically for the second line. The Mac comparison is successful for all three strings. I would like to know how to solve this such that the Windows comparison is successful too.

Any input would be appreciated. Thanks in advance!

Kartik


The best way to do this is to always open your fstreams in text mode (i.e., without fstream::binary). That way the EOLs (whatever they might be on the current platform) get converted into single '\n' characters for you, and that is all you have to worry about.
