Delphi: TStringList does not understand BOM?
Does TStringList not understand BOM?
Tf1 := TFileStream.Create(LIGALOG+'liga.log',fmOpenRead or fmShareDenyNone);
str:=tstringlist.Create;
str.LoadFromStream(tf1);
String1:='FStream '+inttostr(tf1.Size)+'/ String: '+(str.Text);
If the text file is saved as UTF-8 with a BOM, then Str.Count=0 and Str.Text=''. Without the BOM everything works.
If you're using a version of Delphi prior to 2009, the VCL and RTL aren't Unicode-aware, so TStringList treats the BOM as ordinary bytes rather than as an encoding marker.
If you're using D2009 or higher (which support Unicode), you can use the overloaded TStringList.LoadFromStream(Stream: TStream; Encoding: TEncoding) if you know ahead of time what the encoding is; if you don't, the RTL will try to figure it out using TEncoding.GetBufferEncoding. The Delphi XE documentation for TStrings.LoadFromStream covers this.
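A minimal sketch of that overload, assuming Delphi 2009 or later and a file you already know is UTF-8. ReadTextFrom is a hypothetical helper name, not part of the RTL; on XE2 and later the same units are also reachable as System.SysUtils and System.Classes. As far as I know, a BOM that matches the encoding you pass is skipped rather than returned as part of the text.

```delphi
uses
  SysUtils, Classes;

// Hypothetical helper: reads everything from the current position of
// Stream and decodes it with the encoding the caller already knows.
function ReadTextFrom(Stream: TStream; Encoding: TEncoding): string;
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Two-argument overload: no guessing, the given encoding is used.
    Lines.LoadFromStream(Stream, Encoding);
    Result := Lines.Text;
  finally
    Lines.Free;
  end;
end;
```

With the stream from the question, this would be called as String1 := ReadTextFrom(Tf1, TEncoding.UTF8);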
If you don't know ahead of time, and the RTL isn't able to figure it out from the content, you can always read the BOM yourself from the stream, then set Stream.Position to just after the BOM and load the TStringList from that position with the encoding you determined from that BOM.
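The manual approach could look roughly like this. It is a sketch under stated assumptions: SkipBom is a hypothetical helper, it only recognizes the UTF-8 and UTF-16 BOMs (a UTF-32 little-endian file also starts with $FF $FE and would be misread as UTF-16), and falling back to TEncoding.Default when no BOM is found is my choice, not something the RTL requires.

```delphi
uses
  SysUtils, Classes;

// Hypothetical helper: inspects the first bytes of Stream, leaves
// Stream.Position just after any BOM it recognizes, and returns the
// matching encoding. With no recognized BOM it rewinds to the starting
// position and returns TEncoding.Default (an assumed fallback).
function SkipBom(Stream: TStream): TEncoding;
var
  Head: array[0..2] of Byte;
  Start: Int64;
  Count, BomLen: Integer;
begin
  Start := Stream.Position;
  Count := Stream.Read(Head, SizeOf(Head));  // may read fewer than 3 bytes
  Result := TEncoding.Default;
  BomLen := 0;
  if (Count >= 3) and (Head[0] = $EF) and (Head[1] = $BB) and (Head[2] = $BF) then
  begin
    Result := TEncoding.UTF8;
    BomLen := 3;
  end
  else if (Count >= 2) and (Head[0] = $FF) and (Head[1] = $FE) then
  begin
    Result := TEncoding.Unicode;            // UTF-16 little-endian
    BomLen := 2;
  end
  else if (Count >= 2) and (Head[0] = $FE) and (Head[1] = $FF) then
  begin
    Result := TEncoding.BigEndianUnicode;   // UTF-16 big-endian
    BomLen := 2;
  end;
  Stream.Position := Start + BomLen;
end;
```

Since LoadFromStream reads from the current position to the end of the stream, the question's code could then become str.LoadFromStream(Tf1, SkipBom(Tf1));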
Also, creating a TFileStream simply to then load into a TStringList is a waste; TStringList.LoadFromFile will handle the file itself, and is a lot less code if that's all you're going to do with the TStream.
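For comparison, a short sketch of the LoadFromFile route. ReadLogText is a hypothetical helper name; the automatic BOM detection mentioned in the comment applies to Delphi 2009 and later.

```delphi
uses
  SysUtils, Classes;

// Hypothetical helper: loads a whole text file without creating a
// TFileStream by hand. On Delphi 2009+ the single-argument LoadFromFile
// detects a BOM on its own; pass a TEncoding as a second argument to
// force a specific encoding instead.
function ReadLogText(const FileName: string): string;
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    Lines.LoadFromFile(FileName);
    Result := Lines.Text;
  finally
    Lines.Free;
  end;
end;
```

For the file in the question that would be String1 := ReadLogText(LIGALOG+'liga.log');. One difference worth checking: as far as I know, LoadFromFile opens the file with fmShareDenyWrite rather than the fmShareDenyNone used in the question, so it may fail on a log that another process still has open for writing. In that case keeping the TFileStream is reasonable.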
EDIT: After your comment, I thought I'd include a list of the BOMs I'm familiar with - there may be more I'm not aware of:
$00 $00 $FE $FF UTF-32, big-endian (bytes must be swapped for Windows)
$FF $FE $00 $00 UTF-32, little-endian
$FF $FE UTF-16, little-endian (2-byte chars)
$FE $FF UTF-16, big-endian (2-byte chars)
$EF $BB $BF UTF-8 (must be decoded before using Unicode data)
(For future reference: You should indicate in either the tags or the text of your question which version of Delphi you're using, as there are differences in the VCL and RTL between them. When it comes to things like Unicode/BOM type questions, these differences are extremely important.)