Do compilers usually have special optimizations for strings?

2023-04-01 12:34 Q&A author:

Oftentimes you see things like

std::map<std::string, somethingelse> m_named_objects;

std::string state;

//...

if(state == "EXIT")
   exit();
else if(state == "california")
   hot();

where people use strings purely to make something more readable. The same thing could easily be achieved with something like integer-IDs.

Can modern compilers (msvc, g++, etc.) usually employ special optimizations for these types of cases? Or should this be avoided because of bad performance or for other reasons?

Can modern compilers (msvc, g++, etc.) usually employ special optimizations for these types of cases?

As far as I know, compilers don't make those kinds of optimizations. It's definitely not a "standard" optimization.

...where people use strings purely to make something more readable.

At least for your second case, it seems to me that enumerations are more readable and can be faster (since integer comparisons are rather cheap relative to string comparison).

enum State
{
    Alabama,
    Alaska,
    Arizona,
    Arkansas, 
    California,
    Colorado,
    Connecticut,
    Delaware,
    // ... More
};

// ...

State state = California;
if(state == California) { /* true */ }
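If the state arrives as text (from a config file or user input, say), one option is to convert it to the enum once at the boundary and compare integers from then on. A minimal sketch, assuming C++17; `parse_state` and the shortened `State` list are hypothetical and only for illustration:

```cpp
#include <optional>
#include <string_view>

enum class State { Alabama, Alaska, California };

// Hypothetical helper: performs the string comparisons once, at the input
// boundary, so the rest of the program can compare enum values.
constexpr std::optional<State> parse_state(std::string_view name) {
    if (name == "Alabama")    return State::Alabama;
    if (name == "Alaska")     return State::Alaska;
    if (name == "California") return State::California;
    return std::nullopt;  // unknown name
}
```

The comparison is case-sensitive, so "california" would not match "California" here.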

Libraries do.

Compilers might optimize by aliasing shared/identical static strings (assuming that they really are treated as constants).

All C++ standard library implementations I'm currently aware of sport a 'small string optimization', meaning that no extra heap allocation needs to occur for small strings; i.e.

std::string a("small");

will be stored entirely inside the string object itself (so on the stack, for a local variable) - in highly optimized cases perhaps even register allocated(?)
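One rough way to observe this is to check whether the character buffer points inside the string object. This is only a sketch: `stored_inline` is a hypothetical helper, and both the result for short strings and the size threshold depend on the standard library implementation.

```cpp
#include <functional>
#include <string>

// Returns true if the character buffer lives inside the std::string object
// itself, i.e. no separate heap block holds the characters.
bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    // std::less provides a well-defined total order over unrelated pointers.
    std::less<const char*> before;
    return !before(s.data(), obj_begin) && before(s.data(), obj_end);
}
```

On implementations with a small string optimization, `stored_inline(std::string("small"))` would be expected to return true, while a string of 1000 characters cannot fit in the object and returns false.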

If you need blazingly fast string lookups and can afford some time spent building your data structure, look at tries (Wikipedia: Trie, Radix tree).
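To give an idea of the shape of such a structure, here is a minimal, unoptimized trie sketch (C++17). The class and member names are made up for illustration; a production trie or radix tree would use a far more compact node layout than one slot per possible byte.

```cpp
#include <array>
#include <memory>
#include <string_view>

// Minimal trie over byte-valued keys; each node has one child slot per byte.
class Trie {
    struct Node {
        std::array<std::unique_ptr<Node>, 256> next{};
        bool terminal = false;  // true if a key ends at this node
    };
    Node root_;

public:
    void insert(std::string_view key) {
        Node* n = &root_;
        for (unsigned char c : key) {
            if (!n->next[c]) n->next[c] = std::make_unique<Node>();
            n = n->next[c].get();
        }
        n->terminal = true;
    }

    // Lookup cost depends on the key length, not on the number of stored keys.
    bool contains(std::string_view key) const {
        const Node* n = &root_;
        for (unsigned char c : key) {
            n = n->next[c].get();
            if (!n) return false;
        }
        return n->terminal;
    }
};
```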

As far as drop-in replacements go, a lot can usually be gained by using a properly tuned hash map instead of a red-black-tree-based one:

~~std::map<std::string, somethingelse> m_named_objects;~~

replace by

std::unordered_map<std::string, somethingelse> m_named_objects;
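What "properly tuned" means depends on the workload, but pre-sizing the table and adjusting the load factor are the usual first knobs. A small sketch; the `somethingelse` struct is a placeholder for the real mapped type, and the load factor of 0.5 is an arbitrary example value, not a recommendation:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

struct somethingelse { int value = 0; };  // placeholder for the mapped type

// Hypothetical factory: builds a map sized for an expected number of entries.
std::unordered_map<std::string, somethingelse>
make_named_objects(std::size_t expected) {
    std::unordered_map<std::string, somethingelse> m;
    m.max_load_factor(0.5f);  // fewer collisions at the cost of more memory
    m.reserve(expected);      // pre-size buckets to avoid rehashing on insert
    return m;
}
```

Lookups then work as with std::map, e.g. `m.find("EXIT")`, but elements are no longer kept in sorted order.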

Be happy

In the examples given, the compiler generally cannot optimize the comparisons because the string contents are only known at run time.

std::map<std::string, int> does not have the most desirable performance characteristics as operator<() on a std::string is relatively expensive.

Optimizations for strings are for libraries, not compilers. If you want string-like identifiers, enums are one possibility. But a better one, particularly for printing and debugging, is a fixed-length identifier string class.

It would be convertible to const char * and std::string, but it would have zero memory allocations. Instead, it would just be a wrapper around a 32-character (or whatever you want) array.

The best part is that, since it's an identifier, you don't care about ASCII character-by-character comparisons. operator< can just read the 32-character array as 8 uint32_ts, or even as 4 uint64_ts. All you need is an ordering, not a specific ordering. operator== can do similar tests.

It's a pretty simple class to write. If you want case-insensitive comparisons, you could just convert the string to lowercase when you copy it into the object.
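A minimal sketch of such a class might look like the following (C++17). The name `FixedId` and its interface are assumptions for illustration. It silently keeps only the first 31 bytes of longer input and relies on the unused bytes always being zero, which is what makes whole-word comparison valid.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>

// Fixed-length identifier: a 32-byte buffer that is always zero-padded.
class FixedId {
public:
    static constexpr std::size_t kSize = 32;  // 31 characters + '\0'

    FixedId(std::string_view s) {
        const std::size_t n = s.size() < kSize - 1 ? s.size() : kSize - 1;
        if (n != 0) std::memcpy(buf_, s.data(), n);  // rest of buf_ stays zero
    }

    const char* c_str() const { return buf_; }
    std::string str() const { return buf_; }

    friend bool operator==(const FixedId& a, const FixedId& b) {
        return std::memcmp(a.buf_, b.buf_, kSize) == 0;
    }

    // Compares the buffer as four 64-bit words. The resulting order is
    // consistent but not alphabetical (it depends on endianness).
    friend bool operator<(const FixedId& a, const FixedId& b) {
        std::uint64_t wa[4], wb[4];
        std::memcpy(wa, a.buf_, kSize);
        std::memcpy(wb, b.buf_, kSize);
        for (int i = 0; i < 4; ++i)
            if (wa[i] != wb[i]) return wa[i] < wb[i];
        return false;
    }

private:
    char buf_[kSize] = {};
};
```

Since it defines operator<, an object like this can be used directly as the key of a std::map in place of std::string.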

If you need strings longer than 31 bytes (one byte is reserved for the \0 terminator), then I would suggest truncating the string down to size. But truncate from the middle of the given string, not the end. The beginning and end of identifiers tend to be more unique than the middle. You could even put some special characters in a truncated string to identify that it is a truncated version.
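The middle truncation could be sketched like this (C++17). The function name `truncate_middle` and the '~' marker character are arbitrary choices for illustration:

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: shortens s to at most max_len characters by cutting
// out the middle and marking the cut with '~'. Assumes max_len >= 3.
std::string truncate_middle(std::string_view s, std::size_t max_len = 31) {
    if (s.size() <= max_len) return std::string(s);
    const std::size_t keep = max_len - 1;     // one slot for the marker
    const std::size_t head = (keep + 1) / 2;  // head gets the odd character
    const std::size_t tail = keep - head;
    std::string out(s.substr(0, head));
    out += '~';
    out += s.substr(s.size() - tail);
    return out;
}
```

For example, `truncate_middle("abcdefghij", 5)` yields "ab~ij".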

It is also possible to take this idea and put a hash in the string. So the first 4 bytes would be a hash of the original string, not of the truncation. Comparison tests would just use the hash, and the other 28 bytes are there to make it human-readable.
