Get size of a std::string's string in bytes
I would like to get the number of bytes that a std::string's contents occupy in memory, not the number of characters. The string contains a multibyte string. Would std::string::size() do this for me?
EDIT: Also, does size() include the terminating NULL?
std::string operates on bytes, not on Unicode characters, so std::string::size() will indeed return the size of the data in bytes (without the overhead that std::string needs to store the data, of course).
No. size() counts only the data you tell std::string to store; the terminating null character that c_str() exposes is not part of that data. So it will not be included in the size, unless you explicitly create a string whose data contains a trailing null character.
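As a minimal sketch of the point about the terminator (the helper names cStringBytes and withTrailingNull are made up for illustration), the first function adds the one extra byte needed when the data is copied out as a C string, and the second builds a string whose stored data really does end in a null character:

```cpp
#include <cstddef>
#include <string>

// Bytes needed to copy s out as a C string: the stored data plus one
// terminating '\0', which size() itself never counts.
std::size_t cStringBytes(const std::string& s) {
    return s.size() + 1;
}

// Passing an explicit length of 4 makes the '\0' part of the stored data,
// so this string has size() == 4 rather than 3.
std::string withTrailingNull() {
    return std::string("abc\0", 4);
}
```

With these, std::string("abc").size() is 3, cStringBytes(std::string("abc")) is 4, and withTrailingNull().size() is 4.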
You could be pedantic about it:
std::string x("X");
std::cout << x.size() * sizeof(std::string::value_type);
But std::string::value_type is char, and sizeof(char) is defined as 1.
This only becomes important if you typedef the string type (because it may change in the future or because of compiler options).
// Some header file:
typedef std::basic_string<T_CHAR> T_string;
// Source a million miles away
T_string x("X");
std::cout << x.size() * sizeof(T_string::value_type);
std::string::size() is indeed the size in bytes.
To get the amount of memory in use by the string, you would have to add the overhead used for management to capacity(). Note that it is capacity() and not size(). The capacity determines the number of characters (charT) allocated, while size() tells you how many of them are actually in use.
In particular, std::string implementations don't usually shrink_to_fit the contents, so if you create a string and then remove elements from the end, size() will decrease, but in most cases (this depends on the implementation) capacity() will not.
Some implementations might not allocate the exact amount of memory required, but rather obtain blocks of given sizes to reduce memory fragmentation. In an implementation that used power-of-two sized blocks for the strings, a string with size 17 could be allocating as many as 32 characters.
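A rough sketch of that estimate follows. The function name approxFootprint is hypothetical, and the formula is an assumption rather than anything the standard promises: it ignores allocator bookkeeping, and it over-counts on implementations that keep short strings inside the object itself (small-string optimisation).

```cpp
#include <cstddef>
#include <string>

// Approximate footprint: the string object itself plus a buffer sized by
// capacity() (plus one element for the terminator), NOT by size().
// Assumption: allocator overhead is ignored and any small-string buffer
// inside the object is counted twice, so treat the result as a ballpark.
std::size_t approxFootprint(const std::string& s) {
    return sizeof(std::string)
         + (s.capacity() + 1) * sizeof(std::string::value_type);
}
```

For example, after std::string s(100, 'x'); s.erase(10); the size is 10, but capacity() is only guaranteed to be at least 10 and on most implementations stays at 100 or more, so the estimate does not shrink with size().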
Yes, size() will give you the number of char in the string. One character in a multibyte encoding may take up multiple char.
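To make the difference concrete, here is a small sketch that assumes the string holds valid UTF-8 (other multibyte encodings would need different logic). The helper name utf8CodePoints is invented for this example; it counts characters by skipping UTF-8 continuation bytes, while size() keeps reporting bytes.

```cpp
#include <cstddef>
#include <string>

// Counts UTF-8 code points by skipping continuation bytes, which always
// have the bit pattern 10xxxxxx. Assumes s contains valid UTF-8.
std::size_t utf8CodePoints(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;  // lead byte or plain ASCII byte starts a new code point
        }
    }
    return count;
}
```

For the UTF-8 string "h\xC3\xA9llo" (the word "héllo"), size() is 6 because é occupies two bytes, while utf8CodePoints returns 5.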
There is an inherent conflict in the question as written: std::string is defined as std::basic_string<char,...> -- that is, its element type is char (1 byte), but later you stated "the string contains a multibyte string" ("multibyte" == wchar_t?).
The size() member function does not count a trailing null. Its value represents the number of characters, that is, elements of the string's character type, not bytes.
Assuming you intended to say your multibyte string is a std::wstring (alias for std::basic_string<wchar_t,...>), the memory footprint of the std::wstring's characters, including the null terminator, is:
std::wstring myString;
...
size_t bytesCount = (myString.size() + 1) * sizeof(wchar_t);
It's instructive to consider how one would write a reusable template function that would work for ANY potential instantiation of std::basic_string<> like this**:
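#include <cstddef>
#include <string>
// Return number of bytes occupied by inString's characters,
// optionally counting the terminating null of inString.c_str().
template <typename Elem>
inline std::size_t stringBytes(const std::basic_string<Elem>& inString, bool bCountNull)
{
    // size() counts elements, so scale by the element size to get bytes.
    return (inString.size() + (bCountNull ? 1 : 0)) * sizeof(Elem);
}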
** For simplicity, this ignores the traits and allocator types, which are rarely specified explicitly for std::basic_string<> (they have defaults).
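If that simplification ever matters, one possible variant is to template on all three parameters so that strings with custom traits or allocators are accepted too. This is only a sketch, and the name stringBytesAny is invented here to keep it distinct from the function above:

```cpp
#include <cstddef>
#include <string>

// Accepts any std::basic_string instantiation, including ones with
// non-default traits or allocator types.
template <typename CharT, typename Traits, typename Alloc>
std::size_t stringBytesAny(const std::basic_string<CharT, Traits, Alloc>& s,
                           bool countNull) {
    // size() counts elements; multiply by the element size to get bytes.
    return (s.size() + (countNull ? 1 : 0)) * sizeof(CharT);
}
```

For instance, stringBytesAny(std::string("abc"), true) gives 4, and stringBytesAny(std::wstring(L"abc"), true) gives 4 * sizeof(wchar_t), whose value depends on the platform.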