
Why are C#/.NET strings length-prefixed and null terminated?

After reading "What's the rationale for null terminated strings?" and some similar questions, I found that C#/.NET strings are, internally, both length-prefixed and null-terminated, like the BSTR data type.

What is the reason strings are both length-prefixed and null-terminated instead of, e.g., only length-prefixed?


Length-prefixed so that computing the length is O(1).

Null-terminated so that marshaling to unmanaged code is fast (unmanaged code typically expects null-terminated strings).
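To make the two properties concrete, here is a minimal C++ sketch of a buffer that is both length-prefixed and null-terminated. The `PrefixedString` type and its helper functions are hypothetical names invented for illustration; this is not the actual CLR string layout, only the general idea.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical layout for illustration only; NOT the real CLR object layout.
struct PrefixedString {
    std::size_t length;           // stored length: reading it is O(1)
    std::vector<char16_t> chars;  // length + 1 elements; the last is always u'\0'
};

PrefixedString make_prefixed(const std::u16string& text) {
    PrefixedString s{text.size(),
                     std::vector<char16_t>(text.begin(), text.end())};
    s.chars.push_back(u'\0');  // terminator is stored but not counted in length
    assert(s.chars.size() == s.length + 1);
    return s;
}

// O(1): just read the prefix.
std::size_t length_of(const PrefixedString& s) { return s.length; }

// O(n): what code that only sees a null-terminated buffer has to do.
std::size_t scan_length(const char16_t* p) {
    std::size_t n = 0;
    while (p[n] != u'\0') ++n;
    return n;
}

// The same buffer can be handed to a null-terminated consumer without copying.
const char16_t* as_c_string(const PrefixedString& s) { return s.chars.data(); }
```

With this layout, `length_of` never touches the character data, while `as_c_string` returns a pointer that a consumer expecting a terminator can walk safely.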


Here is an excerpt from Jon Skeet's blog post about strings:

Although strings aren't null-terminated as far as the API is concerned, the character array is null-terminated, as this means it can be passed directly to unmanaged functions without any copying being involved, assuming the inter-op specifies that the string should be marshalled as Unicode.


Most likely, to ensure easy interoperability with COM.


While the length field makes it easy for the framework to determine the length of a string (and lets a string contain characters with a zero value), the framework and user programs have to deal with an awful lot of code that expects null-terminated strings.

The Win32 API, for example.

So it's convenient to keep a null terminator at the end of the string data, because it is likely to be needed quite often anyway.

Note that C++'s std::string class is implemented the same way (in MSVC, anyway), and I'm sure it's for the same reason: c_str() is often used to pass a std::string to something that wants a C-style string.
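The std::string comparison can be sketched in a few lines of standard C++. The helper names below (`stored_length`, `c_api_length`, `with_embedded_null`) are made up for this example. It shows that the stored length counts a zero-valued character, while anything that relies on the terminator stops at the first one.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length as the std::string itself reports it (stored, not scanned).
std::size_t stored_length(const std::string& s) { return s.size(); }

// Length as a C-style API would see it: scan c_str() up to the first '\0'.
std::size_t c_api_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Builds the five characters 'a', 'b', '\0', 'c', 'd'.
// The explicit count is required; without it the constructor stops at '\0'.
std::string with_embedded_null() { return std::string("ab\0cd", 5); }
```

For the string built by `with_embedded_null()`, `stored_length` reports 5 and `c_api_length` reports 2, which is why the length field, not the terminator, is the authoritative length inside the string type itself.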


My best guess is that, with a length prefix, finding the length takes constant time, O(1), whereas traversing the string to find the terminator takes O(n).
