Why are std::vector::data and std::string::data different?

Vector's new method data() provides a const and non-const version.

However, string's data() method only provides a const version.

I think they changed the wording about std::string so that the chars are now required to be contiguous (like std::vector).

Was std::string::data just missed? Or is there a good reason to only allow const access to a string's underlying characters?

Note: std::vector::data has another nice feature: calling data() on an empty vector is not undefined behavior, whereas &vec.front() is undefined behavior if the vector is empty.
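To make the note above concrete, here is a minimal sketch of handing a vector to a pointer-plus-length interface. The names sum_raw and sum_vector are made up for illustration; sum_raw just stands in for some C-style API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style helper: sums n ints read through a raw pointer.
int sum_raw(const int* p, std::size_t n) {
    assert(p != nullptr || n == 0);  // a null pointer is only acceptable for an empty range
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];               // never dereferenced when n == 0
    }
    return total;
}

// Forwards a vector to the raw-pointer helper without special-casing emptiness.
int sum_vector(const std::vector<int>& v) {
    // v.data() is well-defined for an empty vector (the pointer may or may not be null).
    // Writing &v.front() here instead would be undefined behavior when v is empty.
    return sum_raw(v.data(), v.size());
}
```

With &v.front() the caller would need an explicit v.empty() check before every such call.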


In C++98/03 there was good reason to not have a non-const data() due to the fact that string was often implemented as COW. A non-const data() would have required a copy to be made if the refcount was greater than 1. While possible, this was not seen as desirable in C++98/03.

In Oct. 2005 the committee voted in LWG 464 which added the const and non-const data() to vector, and added const and non-const at() to map. At that time, string had not been changed so as to outlaw COW. But later, by C++11, a COW string is no longer conforming. The string spec was also tightened up in C++11 such that it is required to be contiguous, and there's always a terminating null exposed by operator[](size()). In C++03, the terminating null was only guaranteed by the const overload of operator[].
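As a small sketch of the C++11 guarantee just described (the function name has_terminator is mine, not from the standard), the non-const operator[] can be checked at index size():

```cpp
#include <cassert>
#include <string>

// Returns true if the non-const operator[] exposes a terminating null at index size().
// Guaranteed from C++11 on; C++03 only guaranteed this for the const overload.
bool has_terminator(std::string& s) {
    // Reading s[s.size()] is allowed; writing anything other than '\0' there is not.
    return s[s.size()] == '\0';
}
```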

So in short a non-const data() looks a lot more reasonable for a C++11 string. To the best of my knowledge, it was never proposed.

Update

charT* data() noexcept;

was added to basic_string in the C++1z working draft N4582 by David Sankel's P0272R1 at the Jacksonville meeting in Feb. 2016.

Nice job David!
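One way to see the effect of that change is to look at the type data() returns on a non-const object. This is only a sketch: the aliases VecData and StrData and the constant string_data_is_mutable are names I made up, and the outcome for std::string depends on which language mode the compiler is in.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Type returned by data() when called on a non-const object.
using VecData = decltype(std::declval<std::vector<char>&>().data());
using StrData = decltype(std::declval<std::string&>().data());

// vector has had the non-const overload since C++11.
static_assert(std::is_same<VecData, char*>::value,
              "vector<char>::data() on a non-const vector yields char*");

// True when compiled as C++17 or later (non-const overload present),
// false in C++11/14 mode where only the const overload exists.
constexpr bool string_data_is_mutable = std::is_same<StrData, char*>::value;
```

In C++11/14 mode the usual way to get a writable pointer is &s[0] on a non-empty string.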


Historically, string's data() has been const-only because a non-const version would prevent several common optimizations, like copy-on-write (COW). COW is now, IIANM, far less common, because it behaves badly in multithreaded programs.

BTW, yes they are now required to be contiguous:

[string.require].5: The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().
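The identity in that quote can be checked directly. A rough sketch, with is_contiguous being a name I picked for illustration:

```cpp
#include <cassert>
#include <string>

// Checks &*(s.begin() + n) == &*s.begin() + n for every 0 <= n < s.size().
bool is_contiguous(const std::string& s) {
    typedef std::string::difference_type diff_t;
    const diff_t size = static_cast<diff_t>(s.size());
    for (diff_t n = 0; n < size; ++n) {
        if (&*(s.begin() + n) != &*s.begin() + n) {
            return false;
        }
    }
    return true;  // trivially true for an empty string: begin() is never dereferenced
}
```

On a conforming C++11 implementation this returns true for every string, which is what makes passing the character buffer to pointer-based APIs legitimate.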

Another reason might be to avoid code such as:

std::string ret;
strcpy(ret.data(), "whatthe...");

Or any other function that writes into a preallocated char array.
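The problem with the snippet above is that ret is empty, so the copy writes past the end of the string's storage. For contrast, a sketch of what I would assume is the safe pattern: size the string first, then write at most size() characters through the pointer. copy_from_c is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Copies a C string into a std::string through a raw pointer, sizing the buffer first.
std::string copy_from_c(const char* src) {
    assert(src != nullptr);
    const std::size_t len = std::strlen(src);
    std::string ret(len, '\0');          // reserve exactly len characters up front
    if (len > 0) {
        // &ret[0] is writable in C++11; ret.data() can be used instead from C++17 on.
        // Only len bytes are copied: the terminator at ret[len] is owned by the string.
        std::memcpy(&ret[0], src, len);
    }
    return ret;
}
```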


Although I'm not that well-versed in the standard, it might be because std::string doesn't need to contain null-terminated data (but it can) and doesn't need to contain an explicit length field (but it can). So changing the underlying data, e.g. writing a '\0' in the middle, might get the string's length field out of sync with the actual char data and thus leave the object in an invalid state.


@Christian Rau

From the time the original Plauger (around 1995 I think) string class was STL-ized by the committee (turned into a Sequence, templatified), std::string has always been std::vector plus string-related stuff (conversion from/to 0-terminated, concatenation, ...), plus some oddities, like COW that's actually "Copy on Write and on non-const begin()/end()/operator[]".

But ultimately a std::string is really a std::vector under another name, with a slightly different focus and intent. So:

  • just like std::vector, std::string has either a size data member or both start and end data members;
  • just like std::vector, std::string does not care about the value of its elements, embedded NUL or others.

std::string is not a C string with syntax sugar, utility functions and some encapsulation, just like std::vector<T> is not T[] with syntax sugar, utility functions and some encapsulation.
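A quick sketch of the embedded-NUL point (with_embedded_nul is a made-up helper): overwriting a character with '\0' does not change size(); it only changes what C functions such as strlen see.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Returns a copy of s with the character at index pos overwritten by '\0'.
std::string with_embedded_nul(std::string s, std::size_t pos) {
    assert(pos < s.size());  // writing at or past size() would not be valid
    s[pos] = '\0';           // an ordinary element write, like v[pos] = 0 on a vector
    return s;
}
```

For "hello" and pos 2, the result still has size() == 5 and the trailing "lo" is still there, while std::strlen on c_str() reports 2.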
