
C++ substr() problems when string contains special characters

I'm trying to split a C++ string into a number of substrings (NUM_LINES), each CHAR_PER_LINE characters long.

    for (int i = 0; i < NUM_LINES; i++) {
        lines[i] = totalstring.substr(i * CHAR_PER_LINE, CHAR_PER_LINE);
    }

This works fine as long as there's no special character in the string. Otherwise substr() returns a string that isn't CHAR_PER_LINE characters long: it stops right before a special character, and the loop exits.

Any hints?


Edit: 1) I'm definitely not reaching the end of my string. If totalstring.length() is 1000 and there is a special character in the first line (that is, within the first CHAR_PER_LINE (30) characters of the string), the loop exits.

2) The special characters I had problems with are, for instance, 'ö' and '–' (the long dash).

EDIT 2:

std::string text = "aaaabbbbccccdödd";
std::string line[4];

for(int i = 0; i < 4; i++) 
    line[i] = text.substr(i*4,4);


for(int i = 0; i < 4; i++)
    std::cout << line[i] << "\n";

This example works. I get a '%' for the ö. So the problem wasn't substr(). Sorry. I'm using Cairo to create a GUI, and it seems my Cairo output is causing the trouble, not substr().


How about a hint of what special characters you're talking about?

My guess is that you reached the end of the string.


The STL doesn't care about special characters. If there are multibyte sequences (e.g. UTF-8), std::string treats them as a sequence of single-byte characters. If you need proper Unicode handling, do not use the built-in substr or length.
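To illustrate the byte-versus-character point, here is a minimal sketch of a splitter that counts UTF-8 code points instead of bytes. The names `is_continuation` and `split_utf8` are made up for this example, and the code assumes the input is already valid UTF-8; it is not a full Unicode solution (combining characters still count as separate code points).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if the byte is a UTF-8 continuation byte (10xxxxxx).
inline bool is_continuation(unsigned char byte) {
    return (byte & 0xC0) == 0x80;
}

// Hypothetical helper: split `text` into pieces of at most `chars_per_line`
// code points without cutting inside a multibyte sequence.
// Assumption: `text` is valid UTF-8.
std::vector<std::string> split_utf8(const std::string& text,
                                    std::size_t chars_per_line) {
    std::vector<std::string> lines;
    if (chars_per_line == 0) {
        return lines;
    }
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t end = start;
        std::size_t count = 0;
        while (end < text.size() && count < chars_per_line) {
            ++end;  // consume the lead byte of the next code point
            // consume any continuation bytes that belong to it
            while (end < text.size() &&
                   is_continuation(static_cast<unsigned char>(text[end]))) {
                ++end;
            }
            ++count;
        }
        lines.push_back(text.substr(start, end - start));
        start = end;
    }
    return lines;
}
```

With the test string from the question encoded as UTF-8, 'ö' occupies two bytes, so the string is 17 bytes long. Calling `split_utf8(text, 4)` should yield four pieces, the last of which holds four characters in five bytes, whereas `text.substr(12, 4)` would return only four bytes and drop the final 'd'.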

You can, however, use std::wstring (from your posting it isn't clear whether you're already using it, but I guess not) - it holds wchar_t characters - large enough for the native character set of your target platform.
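As a rough sketch of the std::wstring approach, the helper below (the name `split_wide` is invented for this example) splits a wide string by wchar_t units. It assumes the text is already available as a wide string; converting from a narrow encoding is a separate step not shown here. Note that on platforms where wchar_t is 16 bits, characters outside the Basic Multilingual Plane still take two units, so this is only safe for characters like 'ö' and '–'.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a wide string into pieces of at most
// `chars_per_line` wchar_t units each.
std::vector<std::wstring> split_wide(const std::wstring& text,
                                     std::size_t chars_per_line) {
    std::vector<std::wstring> lines;
    if (chars_per_line == 0) {
        return lines;
    }
    for (std::size_t pos = 0; pos < text.size(); pos += chars_per_line) {
        // substr clamps the count at the end of the string
        lines.push_back(text.substr(pos, chars_per_line));
    }
    return lines;
}
```

For example, with `std::wstring text = L"aaaabbbbccccd\u00f6dd";` the string has 16 units, and `split_wide(text, 4)` should return four pieces of four units each, the last one containing the 'ö'.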


What's happening is that you're running off the end of the string on the last line. The loop isn't exiting after skipping characters. It exits precisely when it should, and the last line contains the right number of characters; it's just that some of them are garbage, so your diagnostic printout shows the line as short.

The only way the loop could be exited early is if an exception were thrown.
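To make the exception case concrete: std::string::substr throws std::out_of_range only when the start position is greater than the string's size. A count that runs past the end is clamped, and a start position equal to the size yields an empty string. The helper name `substr_throws` below is made up for this sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports whether s.substr(pos, count) throws.
bool substr_throws(const std::string& s, std::size_t pos, std::size_t count) {
    try {
        (void)s.substr(pos, count);
        return false;  // no exception: the result was clamped or empty
    } catch (const std::out_of_range&) {
        return true;   // pos was greater than s.size()
    }
}
```

For an 8-character string, `substr_throws(s, 6, 4)` and `substr_throws(s, 8, 4)` should both be false, while `substr_throws(s, 9, 4)` should be true. In the loop from the question, that means an early exit via exception is only possible once i*CHAR_PER_LINE exceeds the byte length of the string.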
