
Does std::string use string interning?

I'm especially interested in Windows with MinGW.

Thanks.

Update: First, I assumed everyone was familiar with string interning: http://en.wikipedia.org/wiki/String_interning

Second, my problem in detail: I knocked up a string class for practice. Nothing fancy, I just store the size and a char * in a class.

I use memcpy for the assignment.

When I run this to compare the assignment speed of std::string and my string class:

string test1 = "  65 kb text ", test2;   // literal stands in for a 65 KB string
for (int i = 0; i < 1000000; i++)
   {
   test2 = test1;
   }

mystring test3 = "65 kb text", test4;    // same size of text, my own class
for (int i = 0; i < 1000000; i++)
   {
   test4 = test3;
   }

std::string wins by a large margin. The assignment operator in my class does nothing but copy with memcpy. I do not even allocate a new array with the "new" operator, because I check for size equality and only request new memory if needed. How come?

For small strings there is no problem. I can't see how std::string can assign values faster than memcpy. I bet it uses memcpy or something similar in the background too, which is why I asked about interning.

Update 2: By adding a single-character assignment such as test2[15] = 78 inside each loop, I removed the effect of std::string's copy-on-write. Now both versions take the same time (there is a 1-2% difference, but that is negligible). So unless I am entirely mistaken, the MinGW std::string must use COW.
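
A more direct check than timing is to compare the buffer addresses of a string and its copy. The following is only a sketch: the function names are made up for illustration, and it assumes that equal data() pointers mean the two objects share one buffer. The first result depends on the standard library in use, so no particular value should be expected from it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if a fresh copy of a long std::string still points
// at the source's character buffer, which is what a COW implementation would do.
// The result is implementation-specific.
bool copy_shares_buffer() {
    const std::string source(65 * 1024, 'x');  // long enough to rule out a small-string buffer
    const std::string copy = source;           // both const, so only const data() is called
    return copy.data() == source.data();
}

// Hypothetical helper: after a write through the copy, the two strings must own
// separate buffers on any implementation, COW or not.
bool write_separates_buffers() {
    const std::string source(65 * 1024, 'x');
    std::string copy = source;
    copy[15] = 78;                             // the same write used in the modified benchmark
    assert(source[15] == 'x');                 // the source must be unaffected
    return copy.data() != source.data() && copy[15] == 78;
}
```

If copy_shares_buffer() returns true, plain assignment in the benchmark loop only bumps a reference count, which would explain why it beats a memcpy of 65 KB.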

Thank you all for your help.


Simply put, no. String interning is not feasible with mutable strings, such as all std::string-objects.


String interning may be done by the compiler, but only for string literals appearing in the code. If you initialise std::string objects with string literals and some of the literals occur multiple times, the compiler may store only one copy of that literal in your binary. There is no string interning at run time. MinGW supports this compile-time interning of literals.
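
The difference between merged literals and run-time strings can be seen by comparing addresses. This is a rough sketch with made-up function names. Whether identical literals share one address is left to the compiler, so the first function may return either value.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if the compiler merged two identical literals into
// one array. The standard leaves this unspecified, so either result is valid.
bool literals_are_pooled() {
    const char* a = "some text that appears twice";
    const char* b = "some text that appears twice";
    return a == b;
}

// Hypothetical helper: two std::string objects built separately from the same
// literal compare equal by value but each hold their own copy of the characters.
bool runtime_strings_are_distinct() {
    const std::string s1 = "some text that appears twice";
    const std::string s2 = "some text that appears twice";
    assert(s1 == s2);                 // equal contents
    return s1.data() != s2.data();    // separate storage, so no run-time interning
}
```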


Not so much, since std::string is modifiable.

Implementations have been known to attempt the use of copy-on-write, but that causes such problems in multi-threaded code that I think it's out of fashion. It's also very hard to implement correctly - perhaps impossible? If someone takes a pointer to a character in the string, and then modifies another character, I'm not sure that this is permitted to invalidate the first pointer. If it's not allowed, then COW is out of the question too, I think, but I can't remember how it works out.
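
To make the idea concrete, here is a minimal copy-on-write sketch. CowString is a hypothetical class written for illustration, not how any real standard library implements std::string. It assumes single-threaded use, and it deliberately hands out no pointers or references to its characters, which sidesteps the invalidation question raised above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, single-threaded copy-on-write string. Illustrative only.
class CowString {
public:
    explicit CowString(std::string text)
        : buf_(std::make_shared<std::string>(std::move(text))) {}

    // Copy construction and assignment are compiler-generated: they copy the
    // shared_ptr, so they cost a reference-count update, not a character copy.

    char get(std::size_t i) const {
        assert(i < buf_->size());
        return (*buf_)[i];
    }

    void set(std::size_t i, char c) {
        assert(i < buf_->size());
        if (buf_.use_count() > 1) {
            // Another CowString shares the buffer: take a private copy first.
            buf_ = std::make_shared<std::string>(*buf_);
        }
        (*buf_)[i] = c;
    }

    bool shares_buffer_with(const CowString& other) const {
        return buf_ == other.buf_;
    }

private:
    std::shared_ptr<std::string> buf_;
};
```

With this design, assigning one CowString to another takes the same time whatever the length, and the full copy is paid only on the first set() after sharing. That matches the benchmark in the question: fast while the loop only assigns, and on par with memcpy once every iteration writes a character.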


No, there is no string interning in the STL. It doesn't fit the C++ design philosophy to have such a feature.


Two ideas:

  • Is mystring a template class? std::string is a typedef of the class template basic_string. This means that the complete source of basic_string, not just its declarations, is visible to the compiler when your test function is compiled. That extra information enables more optimisations in exchange for longer compilation time.

  • Most C++ standard library implementations are highly optimised (and sadly almost unreadable).
