
C++ - string.compare issues when output to text file is different to console output?

I'm trying to find out whether two strings are the same, for the purpose of unit testing. The first is a predefined string, hard-coded into the program. The second is read in from a text file with an ifstream using std::getline(), and then taken as a substring. Both values are stored as C++ strings.

When I output both of the strings to the console using cout for testing, they both appear to be identical:

ThisIsATestStringOutputtedToAFile
ThisIsATestStringOutputtedToAFile

However, string.compare reports that they are not equal. When output to a text file, the two strings appear as follows:

ThisIsATestStringOutputtedToAFile
T^@h^@i^@s^@I^@s^@A^@T^@e^@s^@t^@S^@t^@r^@i^@n^@g^@O^@u^@t^@p^@u^@t^@t^@e^@d^@T^@o^@A^@F^@i^@l^@e

I'm guessing this is some kind of encoding problem, and if I were working in my native language (good old C#), I wouldn't have too many problems. As it is, I'm with C/C++ and Vi, and frankly don't really know where to go from here! I've tried looking at converting to/from ANSI/Unicode, and also at removing the odd characters, but I'm not even sure whether they really exist or not.

Thanks in advance for any suggestions.

EDIT Apologies, this is my first time posting here. The code below is how I'm going through the process:

ifstream myInput;
ofstream myOutput;

myInput.open(fileLocation.c_str()); 
myOutput.open("test.txt");

TEST_ASSERT(myInput.is_open() == 1);

string compare1 = "ThisIsATestStringOutputtedToAFile";
string fileBuffer;

std::getline(myInput, fileBuffer);
string compare2 = fileBuffer.substr(400,100);

cout << compare1 + "\n";
cout << compare2 + "\n";
myOutput << compare1 + "\n";
myOutput << compare2 + "\n";
cin.get();

myInput.close();
myOutput.close();

TEST_ASSERT(compare1.compare(compare2) == 0);


How did you create the content of myInput? I would guess that this file is created in two-byte encoding. You can use hex-dump to verify this theory, or use a different editor to create this file.

The simplest way would be to launch cmd.exe and type

echo "ThisIsATestStringOutputtedToAFile" > test.txt
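The two-byte-encoding theory can also be checked from inside the test itself, without an external hex-dump tool. The following is only a sketch: hex_dump and count_nul_bytes are hypothetical helper names, not part of the original code, and they simply expose the raw bytes that a terminal would hide.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: render every byte of s as two hex digits,
// separated by single spaces, so invisible bytes become visible.
std::string hex_dump(const std::string& s) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%02x",
                      static_cast<unsigned>(static_cast<unsigned char>(s[i])));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// Hypothetical helper: count embedded NUL bytes. A terminal usually
// shows nothing for them, while Vi shows them as ^@.
std::size_t count_nul_bytes(const std::string& s) {
    std::size_t n = 0;
    for (char c : s) {
        if (c == '\0') ++n;
    }
    return n;
}
```

Printing hex_dump(compare2) next to hex_dump(compare1) in the question's code would show a 00 byte after every character if the file really is two-byte encoded, and count_nul_bytes(compare2) would be non-zero.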

UPDATE:

If you cannot change the encoding of the myInput file, you can try to use wide-chars in your program. I.e. use wstring instead of string, wifstream instead of ifstream, wofstream, wcout, etc.


The following works for me and writes the text pasted below into the file. Note the '\0' character embedded into the string.

#include <iostream>
#include <fstream>
#include <sstream>

int main()
{
    std::istringstream myInput("0123456789ThisIsATestStringOutputtedToAFile\x0 12ou 9 21 3r8f8 reohb jfbhv jshdbv coerbgf vibdfjchbv jdfhbv jdfhbvg jhbdfejh vbfjdsb vjdfvb jfvfdhjs jfhbsd jkefhsv gjhvbdfsjh jdsfhb vjhdfbs vjhdsfg kbhjsadlj bckslASB VBAK VKLFB VLHBFDSL VHBDFSLHVGFDJSHBVG LFS1BDV LH1BJDFLV HBDSH VBLDFSHB VGLDFKHB KAPBLKFBSV LFHBV YBlkjb dflkvb sfvbsljbv sldb fvlfs1hbd vljkh1ykcvb skdfbv nkldsbf vsgdb lkjhbsgd lkdcfb vlkbsdc xlkvbxkclbklxcbv");
    std::ofstream myOutput("test.txt");
    //std::ostringstream myOutput;

    std::string str1 = "ThisIsATestStringOutputtedToAFile";
    std::string fileBuffer;

    std::getline(myInput, fileBuffer);
    std::string str2 = fileBuffer.substr(10,100);

    std::cout << str1 + "\n";
    std::cout << str2 + "\n";
    myOutput << str1 + "\n";
    myOutput << str2 + "\n";

    std::cout << str1.compare(str2) << '\n';

    //std::cout << myOutput.str() << '\n';
    return 0;
}

Output:

ThisIsATestStringOutputtedToAFile
ThisIsATestStringOutputtedToAFile


It turns out that the problem was that the file encoding of myInput was UTF-16, whereas the comparison string was UTF-8. Given the constraints I had for this project (Linux, C/C++ code), the way to convert between them was to use the iconv() functions. To stay compatible with the C++ strings I'd been using, I ended up saving the string to a new text file, then running iconv through the system() command.

system("iconv -f UTF-16 -t UTF-8 subStr.txt -o convertedSubStr.txt");

Reading the outputted string back in then gave me the string in the format I needed for the comparison to work properly.

NOTE I'm aware that this is not the most efficient way to do this. If I'd had the luxury of a Windows environment and the windows.h libraries, things would have been a lot easier. In this case, though, the code was in some rarely used unit tests and didn't need to be highly optimized, so creating, deleting and reading a few text files wasn't an issue.
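For comparison, the temporary files could probably be avoided by calling the iconv(3) library functions directly on the in-memory string. This is a minimal sketch rather than the code actually used above: convert_encoding is a hypothetical helper name, and the encoding names passed to it (for example "UTF-16LE") are assumptions that must match the real file. It assumes a Linux/glibc-style iconv whose input-buffer parameter is char**.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert the raw bytes in `input` from encoding
// `from` to encoding `to` using iconv(3), returning the converted bytes.
std::string convert_encoding(const std::string& input,
                             const char* from, const char* to) {
    // iconv_open takes the target encoding first, then the source.
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)(-1)) {
        throw std::runtime_error("iconv_open failed: unsupported encoding?");
    }

    std::string output;
    char chunk[256];
    // iconv does not modify the input bytes, but its signature wants char**.
    char* in = const_cast<char*>(input.data());
    std::size_t in_left = input.size();

    while (in_left > 0) {
        char* out = chunk;
        std::size_t out_left = sizeof chunk;
        std::size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
        int err = errno;  // save before any other call can change it
        output.append(chunk, sizeof chunk - out_left);
        // E2BIG only means the chunk filled up, so loop again;
        // anything else is invalid or truncated input.
        if (rc == static_cast<std::size_t>(-1) && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv failed: invalid or incomplete input");
        }
    }

    iconv_close(cd);
    return output;
}
```

With something like this, the test could call convert_encoding(compare2, "UTF-16LE", "UTF-8") before the compare. Note that the substr offsets would then need to land on two-byte boundaries of the UTF-16 data, otherwise the conversion fails or produces the wrong characters.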
