
Correct idiom for std::string constants?

I have a map that represents a DB object, and I want to get 'well known' values from it:

 std::map<std::string, std::string> dbo;
 ...
 std::string val = dbo["foo"];

This is all fine, but it strikes me that "foo" is converted to a temporary std::string on every call. Surely it would be better to have a constant std::string. (Of course, it's probably a tiny overhead compared to the disk I/O that just fetched the object, but I think it's still a valid question.) So what is the correct idiom for std::string constants?

For example, I can have

 const std::string FOO = "foo";

in a header, but then I get multiple copies (one in every translation unit that includes it).

EDIT: No answer yet has said how to declare std::string constants. Ignore the whole map, STL, etc. issue. A lot of code is heavily std::string oriented (mine certainly is), and it is natural to want constants for them without paying over and over for the memory allocation.

EDIT2: Took out the secondary question answered by the PDF from Manuel; added an example of the bad idiom.

EDIT3: Summary of answers. Note that I have not included those that suggested creating a new string class. I am disappointed because I hoped there was a simple thing that would work in a header file only (like const char * const). Anyway:

a) from Mark b

 std::map<int, std::string> dict;
 const int FOO_IDX = 1;
 ....
 dict[FOO_IDX] = "foo";
 ....
 std::string &val = dbo[dict[FOO_IDX]];

b) from vlad

 // str.h
 extern const std::string FOO;
 // str.cpp
 const std::string FOO = "foo";

c) from Roger P

 // really you can't do it

(b) seems the closest to what I wanted but has one fatal flaw: I cannot have static module-level code that uses these strings, since they might not have been constructed yet. I thought about (a), and in fact I use a similar trick when serializing the object (sending the index rather than the string), but it seemed like a lot of plumbing for a general-purpose solution. So sadly (c) wins: there is no simple const idiom for std::string.


The copying and lack of "string literal optimization" is just how std::strings work, and you cannot get exactly what you're asking for. Partially this is because virtual methods and dtor were explicitly avoided. The std::string interface is plenty complicated without those, anyway.

The standard requires a certain interface for both std::string and std::map, and those interfaces happen to disallow the optimization you'd like (as "unintended consequence" of its other requirements, rather than explicitly). At least, they disallow it if you want to actually follow all the gritty details of the standard. And you really do want that, especially when it is so easy to use a different string class for this specific optimization.

A separate string class can solve these "problems" (as you said, it's rarely an issue), but unfortunately the world already has number_of_programmers + 1 of those. Even with that wheel reinvention in mind, I have found it useful to have a StaticString class, which offers a subset of std::string's interface: begin/end, substr, find, etc. It also disallows modification (and fits in with string literals that way), storing only a char pointer and a size. You have to be slightly careful that it is only initialized with string literals or other "static" data, but that is somewhat mitigated by the construction interface:

#include <cstddef> // std::size_t

struct StaticString {
  template<std::size_t N>
  explicit StaticString(char const (&data)[N]); // reference to const char array, so string literals bind
  StaticString(StaticString const&); // copy ctor (which is very cheap)

  static StaticString from_c_str(char const* c_str); // static factory function
  // this only requires that c_str not change and outlive any uses of the
  // resulting object(s), and since it must also be called explicitly, those
  // requirements aren't hard to enforce; this is provided because it's explicit
  // that strlen is used, and it is not embedded-'\0'-safe as the
  // StaticString(char const (&data)[N]) ctor is

  operator char const*() const; // implicit conversion "operator"
  // here the conversion is appropriate, even though I normally dislike these

private:
  StaticString(); // not defined
};

Use:

StaticString s ("abc");
assert(s != "123"); // overload operators for char*
some_func(s); // implicit conversion
some_func(StaticString("abc")); // temporary object initialized from literal

Note the primary advantage of this class is explicitly to avoid copying string data, so the string literal storage can be reused. There's a special place in the executable for this data, and it is generally well optimized as it dates back from the earliest days of C and beyond. In fact, I feel this class is close to what string literals should've been in C++, if it weren't for the C compatibility requirement.

By extension, you could also write your own map class if this is a really common scenario for you, and that could be easier than changing string types.
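
To make the StaticString idea above a bit more concrete, here is a minimal sketch of how such a class might be filled in. It is not the original author's implementation: the member names data_ and size_, the private two-argument constructor, and the comparison operators are assumptions made for illustration, and the implicit conversion operator is left out.

```cpp
#include <cstddef>
#include <cstring>

// Minimal sketch of a non-owning, immutable string (illustrative only).
class StaticString {
public:
    // Binds to string literals; N counts the trailing '\0', so the size is N - 1.
    template <std::size_t N>
    explicit StaticString(char const (&data)[N]) : data_(data), size_(N - 1) {}

    // Explicit factory for other long-lived C strings; uses strlen, so it
    // stops at the first embedded '\0'.
    static StaticString from_c_str(char const* c_str) {
        return StaticString(c_str, std::strlen(c_str));
    }

    char const* begin() const { return data_; }
    char const* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }

    friend bool operator==(StaticString const& a, StaticString const& b) {
        return a.size_ == b.size_ && std::memcmp(a.data_, b.data_, a.size_) == 0;
    }
    friend bool operator!=(StaticString const& a, StaticString const& b) {
        return !(a == b);
    }

private:
    StaticString(char const* data, std::size_t size) : data_(data), size_(size) {}

    char const* data_;   // points at storage owned by someone else
    std::size_t size_;   // length without the terminating '\0'
};
```

Copying such an object copies only a pointer and a size, which is the whole point: the characters stay in the literal's storage.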


It's simple: use

extern const std::string FOO;

in your header, and

const std::string FOO("foo");

in the appropriate .cpp file.
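
If the constants also have to be usable from other static initializers (the flaw the question's summary points out), one common workaround is to wrap each string in a function with a local static, so that it is constructed on first use. This is a different technique from the extern declaration above. A minimal sketch, assuming C++11 or later; foo_key and greeting are made-up names for illustration:

```cpp
#include <string>

// Hypothetical accessor: the string is constructed the first time the
// function is called, so it is safe to call from other static initializers.
inline const std::string& foo_key() {
    static const std::string value("foo");
    return value;
}

// A namespace-scope object can now depend on the constant without caring
// about the order in which translation units are initialized.
static const std::string greeting = "hello " + foo_key();
```

Because the function is inline it can live entirely in a header; the price is writing foo_key() instead of a plain FOO.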


  1. It's possible to avoid the overhead of creating a std::string when all you want is a constant string, but you'll need to write a special class for that because there's nothing similar in the STL or in Boost. A better alternative is to use a class like StringPiece from Chromium or StringRef from LLVM. See this related thread for more info.

  2. If you decide to stay with std::string (which you probably will) then another good option is to use the Boost MultiIndex container, which has the following feature (quoting the docs):

    Boost MultiIndex [...] provides lookup operations accepting search keys different from the key_type of the index, which is a specially useful facility when key_type objects are expensive to create.

Maps with Expensive Keys by Andrei Alexandrescu (C/C++ Users Journal, Feb. 2006) is related to your problem and is a very good read.
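
The "search keys different from the key_type" idea can also be tried without Boost. Since C++14, std::map supports heterogeneous lookup when it is given a transparent comparator such as std::less<>. The sketch below uses that standard-library feature rather than Boost MultiIndex; the alias Dbo and the helper find_value are made-up names.

```cpp
#include <functional>
#include <map>
#include <string>

// std::less<> is a transparent comparator, so find() can compare the stored
// std::string keys against a const char* directly.
using Dbo = std::map<std::string, std::string, std::less<>>;

// Hypothetical helper: returns a pointer to the value stored under `key`,
// or nullptr if the key is absent.
inline const std::string* find_value(const Dbo& dbo, const char* key) {
    auto it = dbo.find(key);  // the key is not converted to a std::string
    return it != dbo.end() ? &it->second : nullptr;
}
```

Note that operator[] still takes the map's own key type, so only find, count, lower_bound and similar lookups benefit from this.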


The correct idiom is the one you're using. 99.99% of the time there is no need to worry about the overhead of std::string's constructor.

I do wonder if std::string's constructor could be turned into an intrinsic function by a compiler? Theoretically it might be possible, but my comment above would be explanation enough for why it hasn't happened.


It appears that you already know what the string literals will be at runtime, so you can set up an internal mapping between enumerated values and an array of strings. Then you would use the enumeration instead of an actual const char* literal in your code.

enum ConstStrings
{
    MAP_STRING,
    FOO_STRING,
    NUM_CONST_STRINGS
};

std::string constStrings[NUM_CONST_STRINGS];

bool InitConstStrings()
{
    constStrings[MAP_STRING] = "map";
    constStrings[FOO_STRING] = "foo";
    return true; // the function is declared bool, so it must return a value
}

// Be careful if you need to use these strings prior to main being called.
bool doInit = InitConstStrings();

const std::string& getString(ConstStrings whichString)
{
    // Feel free to do range checking if you think people will lie to you about the parameter type.
    return constStrings[whichString];
}

Then you would say dbo[getString(FOO_STRING)] or similar.

As an aside, also consider storing the return value by const reference rather than by copy if you don't need to modify it:

const std::string& val = dbo["foo"];


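In C++14 you can use the std::string literal suffix:

using namespace std::string_literals; // brings operator""s into scope
const std::string FOO = "foo"s;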


The issue is that std::map copies the key and values into its own structures.

You could have a std::map<const char *, const char *>, but you would have to provide a function object (or function) to compare the key data, because the template is then instantiated for pointers. By default, the map would compare the pointers and not the data they point to.

The trade-off is a one-time copy (std::string) versus calling a comparator (const char *).

Another alternative is to write your own map function.
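
A rough sketch of the const char * map with a comparator, assuming all keys and values are string literals or other data that outlives the map. CStrLess and LiteralMap are illustrative names.

```cpp
#include <cstring>
#include <map>

// Orders C strings by their contents instead of by pointer value.
struct CStrLess {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

// The map stores only pointers, so every key and value must outlive the map
// (string literals do).
using LiteralMap = std::map<const char*, const char*, CStrLess>;
```

With this comparator, a lookup with any C string that has the same characters finds the entry, even if it lives at a different address than the literal used on insertion.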


I think what you're looking for is boost::flyweight<std::string>.

This is a logically const reference to a shared string value, with very efficient storage and high performance.


My solution (having the advantage of being able to use C++11 features that weren't present when this question was previously answered):

#define INTERN(x) ([]() -> std::string const & { \
    static const std::string y = x; \
    return y; \
}())

my_map[INTERN("key")] = 5;

Yes, it's a macro and it could use a better name.
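
One detail worth knowing about the macro: every place it is expanded is a separate lambda with its own static string, so the caching is per call site rather than per key text. A small sketch to show this; key_address is a made-up helper.

```cpp
#include <string>

#define INTERN(x) ([]() -> std::string const & { \
    static const std::string y = x; \
    return y; \
}())

// Hypothetical helper: every call goes through the same expansion site,
// so it always returns the address of the same static string.
inline const std::string* key_address() {
    return &INTERN("key");
}
```

Calling key_address() repeatedly yields the same object, while a second INTERN("key") written elsewhere constructs another string of its own, once.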
