C++ substr() problems when string contains special characters
I'm trying to split a C++ string into NUM_LINES substrings, each CHAR_PER_LINE characters long.
for (int i = 0; i < NUM_LINES; i++) {
    lines[i] = totalstring.substr(i * CHAR_PER_LINE, CHAR_PER_LINE);
}
Works fine as long as there's no special character in the string. Otherwise substr() gets me a string that isn't CHAR_PER_LINE characters long, but stops right before a special character and exits the loop.
Any hints?
ok, edit: 1) I'm definitely not reaching the end of my string. If my totalstring.length() is 1000 and I have a special character in the first line (that is the first CHAR_PER_LINE (30) chars of the string) the loop exits.
2) Special characters I had problems with are for instance 'ö' and '–' (the long one)
EDIT 2:
std::string text = "aaaabbbbccccdödd";
std::string line[4];
for (int i = 0; i < 4; i++)
    line[i] = text.substr(i * 4, 4);
for (int i = 0; i < 4; i++)
    std::cout << line[i] << "\n";
This example works. I get a '%' for the ö. So the problem wasn't substr(). Sorry. I'm using Cairo to create a GUI, and it seems my Cairo output is causing the trouble, not substr().
How about a hint of what special characters you're talking about?
My guess is that you reached the end of the string.
The STL doesn't care about special characters. If the string contains multibyte sequences (e.g. UTF-8), std::string treats them as a sequence of single-byte characters. If you need proper Unicode handling, do not use the built-in substr or length.
You can, however, use std::wstring (from your post it isn't clear whether you're already using it, but I guess not). It holds wchar_t characters, which are wide enough for the native character set of your target platform.
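If the text is UTF-8 (an assumption; the question doesn't say which encoding is in use), one way to avoid cutting a character in half is to count code points instead of bytes. The following is only a rough sketch: split_utf8 is a made-up helper name, not a standard function, and it assumes the input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the standard library.
// Splits UTF-8 text into pieces of at most chars_per_line code points,
// never cutting inside a multibyte sequence. Assumes valid UTF-8 input.
std::vector<std::string> split_utf8(const std::string& text,
                                    std::size_t chars_per_line) {
    assert(chars_per_line > 0);
    std::vector<std::string> lines;
    std::size_t start = 0;  // byte offset where the current line begins
    std::size_t count = 0;  // code points collected in the current line
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Continuation bytes look like 10xxxxxx; any other byte starts a code point.
        const bool starts_code_point =
            (static_cast<unsigned char>(text[i]) & 0xC0) != 0x80;
        if (starts_code_point) {
            if (count == chars_per_line) {
                // The current line is full: cut it just before this code point.
                lines.push_back(text.substr(start, i - start));
                start = i;
                count = 0;
            }
            ++count;
        }
    }
    // Whatever is left forms the last (possibly shorter) line.
    if (start < text.size()) {
        lines.push_back(text.substr(start));
    }
    return lines;
}
```

With the string from EDIT 2 stored as UTF-8, 'ö' takes two bytes (0xC3 0xB6), so text.size() is 17 rather than 16, and a plain text.substr(12, 4) drops the final 'd'. Calling split_utf8(text, 4) keeps the last piece as the five bytes of "dödd". Note that this counts code points, not what a user perceives as characters, so combining sequences would still need more care.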
What's happening is that you're running off the end of the string on the last line. The loop isn't exiting after skipping characters: it exits precisely when it should, and the last line contains the right number of characters. Some of them are just garbage, so your diagnostic printout makes the line look short.
The only way the loop could be exited early is if an exception were thrown.
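For reference, here is a small sketch of when substr actually throws. substr_throws and demo are names I made up for illustration. As far as the standard goes, substr(pos, len) throws std::out_of_range only when pos is greater than size(); a len that reaches past the end is silently clamped.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports whether s.substr(pos, len) throws.
bool substr_throws(const std::string& s, std::size_t pos, std::size_t len) {
    try {
        (void)s.substr(pos, len);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Small self-check; call demo() from main() to run it.
void demo() {
    const std::string s = "abcdefgh";   // 8 bytes
    assert(s.substr(6, 4) == "gh");     // length is clamped at the end
    assert(s.substr(8, 4).empty());     // pos == size() is allowed
    assert(substr_throws(s, 9, 4));     // pos > size() throws
}
```

So if NUM_LINES * CHAR_PER_LINE stays within the length of the string, the loop in the question can't be cut short by substr itself.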