FTP filename encoding
Hi, I use the Twisted library to connect to an FTP server, but I have a problem with filename encoding. I receive 'Illusion-N\xf3z.txt', so it's not Unicode. Is there any FTP command to force a specific encoding? Thanks in advance! MK
There are two possibilities:
- FTP is not unicode aware. It looks like the server you're talking to in this example is sending Latin-1 encoded bytes. So you need to decode the bytes using that encoding when you receive them.
- There is an RFC (RFC 2640) which updates FTP to be UTF-8-aware. Check the results of the FEAT command to see if UTF8 is there (but it probably isn't, since the example bytes are not valid UTF-8). If it is, decode the bytes using UTF-8.
Twisted's FTP client won't do anything Unicode-related for you, since it just implements the basic FTP RFC.
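To make the two cases above concrete, here is a minimal sketch in plain Python. It assumes you have already obtained the lines of the server's FEAT reply and the raw filename bytes by whatever means your client offers; the helper names `server_supports_utf8` and `decode_filename` are made up for illustration and are not part of Twisted.

```python
def server_supports_utf8(feat_lines):
    """Return True if any line of a FEAT reply advertises UTF8."""
    return any(line.strip().upper() == "UTF8" for line in feat_lines)


def decode_filename(raw, utf8_advertised, fallback="latin-1"):
    """Decode raw filename bytes, preferring UTF-8 when the server advertises it."""
    if utf8_advertised:
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            pass  # server advertised UTF8 but sent something else
    # Assumption: the server uses a single-byte Western encoding.
    return raw.decode(fallback)


# Hypothetical FEAT reply that does not list UTF8.
feat_reply = ["211-Features:", " MDTM", " SIZE", "211 End"]
name = decode_filename(b"Illusion-N\xf3z.txt", server_supports_utf8(feat_reply))
print(name)
```

The fallback encoding is a guess about the server, so it is worth making it configurable rather than hard-coding it.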
FTP ignores encodings; as long as a filename does not contain a '\0' (null character) and '/' (slash) separates directories, it happily accepts anything.
Do your own decoding and encoding of the filenames. The encoding used in your example is quite probably "cp1252", the Windows Western European code page (for this particular byte, \xf3, Latin-1 gives the same character, ó).
In your case, when you receive 'Illusion-N\xf3z.txt', convert it to Unicode with b'Illusion-N\xf3z.txt'.decode('cp1252') (the b prefix marks a byte string and is required on Python 3).
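As a rough illustration of doing your own decoding and encoding, the snippet below decodes the received bytes and encodes the name back before sending it to the server. It assumes the server really uses cp1252, which is only a guess based on the example bytes.

```python
raw = b"Illusion-N\xf3z.txt"

# Bytes from the server -> text for use inside the program.
name = raw.decode("cp1252")

# Text -> bytes again when sending a command such as RETR or DELE.
wire = name.encode("cp1252")
assert wire == raw

# cp1252 and Latin-1 agree on \xf3, but differ in the 0x80-0x9f range.
same_here = raw.decode("latin-1") == name
euro_cp1252 = b"\x80".decode("cp1252")
euro_latin1 = b"\x80".decode("latin-1")
```

If filenames on the server ever contain characters such as the euro sign or curly quotes, the choice between cp1252 and Latin-1 starts to matter.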