Delphi - Store WideStrings inside a program

In the past I used INI files to store Unicode text, but now I need to store Unicode text in the executable itself. How can I achieve this?

I want to store these letters:

āčēūīšķļņž


If you want to keep storing Unicode text in INI files, you can try the following code. It saves the values in UTF-8 encoding.

You might also take a look at this Unicode library, which contains a lot of helper functions.

uses Windows, IniFiles;

// Converts UTF-16 text to UTF-8 bytes held in an AnsiString.
function WideStringToUTF8(const Value: WideString): AnsiString;
var
  BufferLen: Integer;
begin
  Result := '';

  if Value <> '' then
  begin
    // First call: ask for the required size in bytes. Passing an explicit
    // length (rather than -1) leaves the null terminator out of the count.
    BufferLen := WideCharToMultiByte(CP_UTF8, 0, PWideChar(Value), Length(Value), nil, 0, nil, nil);
    SetLength(Result, BufferLen);
    if BufferLen > 0 then
      WideCharToMultiByte(CP_UTF8, 0, PWideChar(Value), Length(Value), PAnsiChar(Result), BufferLen, nil, nil);
  end;
end;

// Converts UTF-8 bytes held in an AnsiString back to UTF-16 text.
function UTF8ToWideString(const Value: AnsiString): WideString;
var
  BufferLen: Integer;
begin
  Result := '';

  if Value <> '' then
  begin
    // First call: ask for the required size in WideChars.
    BufferLen := MultiByteToWideChar(CP_UTF8, 0, PAnsiChar(Value), Length(Value), nil, 0);
    SetLength(Result, BufferLen);
    if BufferLen > 0 then
      MultiByteToWideChar(CP_UTF8, 0, PAnsiChar(Value), Length(Value), PWideChar(Result), BufferLen);
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  IniFile: TIniFile;
const
  UnicodeValue = WideString(#$0101#$010D#$0113#$016B#$012B#$0161);
begin
  IniFile := TIniFile.Create('C:\test.ini');

  try
    IniFile.WriteString('Section', 'Key', WideStringToUTF8(UnicodeValue));
    IniFile.UpdateFile;
  finally
    IniFile.Free;
  end;
end;

procedure TForm1.Button2Click(Sender: TObject);
var
  IniFile: TIniFile;
  UnicodeValue: WideString;
begin
  IniFile := TIniFile.Create('C:\test.ini');

  try
    UnicodeValue := UTF8ToWideString(IniFile.ReadString('Section', 'Key', 'Default'));
    MessageBoxW(Handle, PWideChar(UnicodeValue), 'Caption', 0);
  finally
    IniFile.Free;
  end;
end;


The code above was run with Delphi 2007 on 64-bit Windows 7 Enterprise SP1.


If you have to stay on Delphi 7, there are several options:

  1. Store the strings in resources linked into the executable file.

  2. Store the strings in a big memo or a similar component placed on a global data module (or on any other visual or non-visual component) and access them by index. This works because strings in Delphi resources are stored in XML-encoded form. For example, your sample āčēūīšķļņž will be stored as &#257;&#269;&#275;&#363;&#299;&#353;&#311;&#316;&#326;&#382;

  3. Store XML-encoded or Base64-encoded strings in string constants inside your code.
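As a rough sketch of option 1, assuming Delphi 7: the text is saved as a UTF-8 file without a BOM, linked in as an RCDATA resource, and decoded at run time. The names strings.rc, strings.res, unicode_text.txt, UNICODE_TEXT and LoadUnicodeText are made up for this illustration. The resource script would contain the single line `UNICODE_TEXT RCDATA "unicode_text.txt"` and be compiled with brcc32.

```pascal
uses
  Windows, Classes;

{$R strings.res}  // hypothetical .res file built from strings.rc with brcc32

// Reads the RCDATA resource named UNICODE_TEXT (a hypothetical name) and
// decodes its UTF-8 bytes into a WideString.
function LoadUnicodeText: WideString;
var
  Stream: TResourceStream;
  Utf8: AnsiString;
begin
  Stream := TResourceStream.Create(HInstance, 'UNICODE_TEXT', RT_RCDATA);
  try
    SetLength(Utf8, Stream.Size);
    if Stream.Size > 0 then
      Stream.ReadBuffer(Utf8[1], Stream.Size);  // raw UTF-8 bytes
  finally
    Stream.Free;
  end;
  Result := UTF8Decode(Utf8);  // UTF-8 -> UTF-16
end;
```

If the text file is saved with a BOM, the BOM bytes end up in the resource too and would have to be stripped before decoding.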

For string conversion you can use EncdDecd.pas, xdom.pas, or some of the functions in System.pas such as UTF8Encode/UTF8Decode.
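A minimal sketch of option 3 with Base64, assuming Delphi 7 (where string literals are ANSI) and the DecodeString function from EncdDecd.pas. The names SAMPLE_BASE64 and ShowSample are made up for the example. The constant holds the Base64 form of the UTF-8 bytes C4 81 C4 8D, which are the first two letters from the question, "āč".

```pascal
uses
  Windows, EncdDecd;

const
  // Base64 of the UTF-8 bytes C4 81 C4 8D, i.e. the two letters U+0101 U+010D
  SAMPLE_BASE64 = 'xIHEjQ==';

procedure ShowSample;
var
  Utf8: AnsiString;
  Wide: WideString;
begin
  Utf8 := DecodeString(SAMPLE_BASE64);  // Base64 -> raw UTF-8 bytes
  Wide := UTF8Decode(Utf8);             // UTF-8 -> UTF-16
  MessageBoxW(0, PWideChar(Wide), 'Caption', MB_OK);
end;
```

The constant stays pure ASCII, so it survives any source file encoding, at the cost of being unreadable in the editor.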

To display and edit Unicode strings in Delphi forms, you can use a dedicated set of Unicode controls such as TNT Unicode Controls, or subclass the original Delphi controls and implement the workarounds yourself, as described in this excerpt from the comments in TntControls.pas (part of TNT Unicode Controls):

Windows NT provides support for native Unicode windows. To add Unicode support to a TWinControl descendant, override CreateWindowHandle() and call CreateUnicodeHandle().

One major reason this works is because the VCL only uses the ANSI version of SendMessage() -- SendMessageA(). If you call SendMessageA() on a UNICODE window, Windows deals with the ANSI/UNICODE conversion automatically. So for example, if the VCL sends WM_SETTEXT to a window using SendMessageA, Windows actually expects a PAnsiChar even if the target window is a UNICODE window. So calling SendMessageA with PChars causes no problems.

A problem in the VCL has to do with the TControl.Perform() method. Perform() calls the window procedure directly and assumes an ANSI window. This is a problem if, for example, the VCL calls Perform(WM_SETTEXT, ...) passing in a PAnsiChar which eventually gets passed down to DefWindowProcW() which expects a PWideChar.

This is the reason for SubClassUnicodeControl(). This procedure will subclass the Windows WndProc, and the TWinControl.WindowProc pointer. It will determine if the message came from Windows or if the WindowProc was called directly. It will then call SendMessageA() for Windows to perform proper conversion on certain text messages.

Another problem has to do with TWinControl.DoKeyPress(). It is called from the WM_CHAR message. It casts the WideChar to an AnsiChar, and sends the resulting character to DefWindowProc. In order to avoid this, the DefWindowProc is subclassed as well. WindowProc will make a WM_CHAR message safe for ANSI handling code by converting the char code to #FF before passing it on. It stores the original WideChar in the .Unused field of TWMChar. The code #FF is converted back to the WideChar before passing onto DefWindowProc.


Declare it like this:

const MyString = WideString('Teksts latvie'#$0161'u valod'#$0101);


The simple idea is to find a non-visual component that can store text, and to store your text there. Ideally the component also provides an editor, so you can edit the text at design time.

There is a component called FormResource that can do this. I use TUniScript. I believe there are other similar components, but I did not find a usable one in the standard library.


The approach WideString(#$65E5#$672C) does not work, because Delphi 7 does not expect more than one byte after the #, so for values above 255 ($FF) the outcome is far from what you expect.

Another approach, WideChar($65E5) + WideChar($672C), can be used to put single Unicode code points in your source code. The expression has to start with a WideString (which can be an empty literal) so that the compiler understands which data type you want:

const
  // Compiler error "Incompatible types"
  WONT_COMPILE = WideChar($65E5) + WideChar($672C);

  // 日本
  NIPPON = WideString('') + WideChar($65E5) + WideChar($672C);

It looks cumbersome, but it does get your UTF-16 text into Delphi 7.

Alternatively, store your constants in UTF-8, which is ASCII-safe, so you can use # without trouble. The advantage is that it is a lot less cumbersome to write in your source code. The disadvantage is that you can never use the constant directly: you have to convert it to UTF-16 first:

const
  // UTF-8 of the two characters 日 and 本, needing 3 bytes each
  NIPPON = #$E6#$97#$A5#$E6#$9C#$AC;
var
  sUtf16: WideString;
begin
  // UTF8ToWideString is the helper function defined earlier on this page.
  // Internally the result is 2 WORDs: $65E5 and $672C
  sUtf16 := UTF8ToWideString(NIPPON);
end;