Why does Delphi warn when assigning ShortString to string?

I'm converting some legacy code to Delphi 2010.

There are a fair number of old ShortStrings, like string[25].

Why does the assignment below:

var
  S: String;
  ShortS: String[25];

...
S := ShortS;

cause the compiler to generate this warning:

W1057 Implicit string cast from 'ShortString' to 'string'.

No data loss is occurring here. Under what circumstances would this warning be useful to me?

Thanks!

Tomw


It's because your code is implicitly converting a single-byte character string to a UnicodeString. It's warning you in case you might have overlooked it, since that can cause problems if you do it by mistake.

To make it go away, use an explicit conversion:

S := string(ShortS);
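
For context, here is a minimal sketch of that cast inside a complete console program. The program name and the sample text are made up for illustration; only the two declarations and the cast come from the question.

```delphi
program ShortStringCast;

{$APPTYPE CONSOLE}

var
  S: string;            // UnicodeString in Delphi 2009 and later
  ShortS: string[25];   // ShortString limited to 25 characters

begin
  ShortS := 'Hello';
  // Explicit cast: the same conversion takes place, but the compiler
  // no longer reports W1057 for this assignment.
  S := string(ShortS);
  Writeln(S);
end.
```

If you would rather silence the warning for a whole unit instead of casting each assignment, the {$WARN IMPLICIT_STRING_CAST OFF} directive should do that, at the cost of hiding the conversions you did not intend.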


The ShortString type has not changed. It continues to be, in effect, an array of AnsiChar.

By assigning it to a string type, you are taking what is a group of AnsiChars (one byte) and putting it into a group of WideChars (two bytes). The compiler can do that just fine, and is smart enough not to lose data, but the warning is there to let you know that such a conversion has taken place.
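
A small sketch that makes the size difference visible, assuming Delphi 2009 or later, where Char is WideChar. The program name and the sample value are illustrative only.

```delphi
program CharSizes;

{$APPTYPE CONSOLE}

var
  S: string;
  ShortS: string[25];

begin
  ShortS := 'abc';
  S := string(ShortS);        // each AnsiChar is widened to a WideChar

  Writeln(SizeOf(AnsiChar));  // 1: element size of a ShortString
  Writeln(SizeOf(Char));      // 2: element size of a UnicodeString
  Writeln(SizeOf(ShortS));    // 26: one length byte plus 25 characters
  Writeln(Length(S));         // 3: the character count is unchanged
end.
```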


The warning is very important because you may lose data. The conversion is done using the current Windows 8-bit character set, and some character sets do not define all values between 0 and 255, or are multi-byte character sets, and thus cannot convert all byte values.

The data loss can occur on an ordinary computer in a country whose standard character set has these properties, or on a computer in the USA that has been set up for a different locale, for example because the user communicates a lot with people in other languages.

For instance, if the local code page is 932, the byte values 129 and 130 will both convert to the same value in the Unicode string. In that code page both bytes are lead bytes of double-byte characters, so neither is a valid character on its own.

In addition to this, the conversion involves a Windows API call, which is an expensive operation. If you do a lot of these, it can slow down your application.


It's safe (as long as you're using the ShortString for its intended purpose: to hold a string of characters, and not a collection of bytes, some of which may be 0), but it may have performance implications if you do it a lot.

As far as I know, Delphi has to allocate memory for the new Unicode string, extract the characters from the ShortString into a null-terminated string (that's why it's important that it's a properly formed string), and then call something like the Windows API MultiByteToWideChar() function. Not rocket science, but not a trivial operation either.
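
To give a feel for what such a conversion involves, here is a rough sketch that calls MultiByteToWideChar() directly. It is not the RTL's actual code: DecodeShort and ShowFirstChar are hypothetical helpers, the code pages 1252 and 1251 are arbitrary choices, and the unit names assume Delphi 2010 (newer versions may need Winapi.Windows and System.SysUtils).

```delphi
program CodePageDemo;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical helper: decode the bytes of a ShortString under an
// explicitly chosen code page instead of the system default.
function DecodeShort(const Src: ShortString; CodePage: Cardinal): string;
var
  Len: Integer;
begin
  Result := '';
  if Length(Src) = 0 then
    Exit;
  // First call: ask how many WideChars the result needs.
  Len := MultiByteToWideChar(CodePage, 0, PAnsiChar(@Src[1]),
    Length(Src), nil, 0);
  if Len = 0 then
    Exit;  // conversion failed, e.g. code page not installed
  SetLength(Result, Len);
  // Second call: perform the conversion into the allocated buffer.
  MultiByteToWideChar(CodePage, 0, PAnsiChar(@Src[1]),
    Length(Src), PWideChar(Result), Len);
end;

// Hypothetical helper: print the first UTF-16 code unit as hex.
procedure ShowFirstChar(const S: string);
begin
  if S <> '' then
    Writeln(IntToHex(Ord(S[1]), 4))
  else
    Writeln('(conversion failed)');
end;

var
  ShortS: string[25];

begin
  // Store the raw byte $E9 without going through a literal.
  SetLength(ShortS, 1);
  ShortS[1] := AnsiChar($E9);

  ShowFirstChar(DecodeShort(ShortS, 1252));  // 00E9 where CP1252 is available
  ShowFirstChar(DecodeShort(ShortS, 1251));  // 0439 where CP1251 is available
end.
```

The same byte produces two different Unicode characters depending on the code page, and each conversion costs an allocation plus two API calls, which is the overhead described above.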


ShortStrings don't have a code page associated with them; AnsiStrings do (since D2009).

The conversion from ShortString to UnicodeString can only be done on the assumption that ShortStrings are encoded in the default ANSI encoding, which is not a safe assumption.
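
If you know which code page the ShortString data really uses, one possible workaround is to copy the bytes into a RawByteString and tag it before converting. This is only a sketch: the program name is made up, code page 1251 is an arbitrary example, and it relies on the System routines SetCodePage and StringCodePage as I understand them from Delphi 2009 onwards.

```delphi
program TaggedConversion;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  ShortS: string[25];
  Raw: RawByteString;
  S: string;

begin
  // Example payload: the single byte $E9, assumed to be Windows-1251 text.
  SetLength(ShortS, 1);
  ShortS[1] := AnsiChar($E9);

  // Copy the bytes unchanged into a reference-counted byte string.
  SetString(Raw, PAnsiChar(@ShortS[1]), Length(ShortS));

  // Attach the code page without converting the bytes (Convert = False).
  SetCodePage(Raw, 1251, False);
  Writeln(StringCodePage(Raw));

  // The conversion to UnicodeString should now use the tagged code page
  // rather than the system default.
  S := string(Raw);
  Writeln(IntToHex(Ord(S[1]), 4));
end.
```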


I don't really know Delphi, but if I remember correctly, ShortStrings are essentially a fixed-size sequence of characters stored inline (on the stack, for a local variable), whereas a regular string (AnsiString before Delphi 2009, UnicodeString since) is actually a reference to a location on the heap. This may have different implications.

Here's a good article on the different string types: http://www.codexterity.com/delphistrings.htm

I think there might also be a difference in terms of encoding but I'm not 100% sure.
