Python App Engine: form-posted UTF-8 file issue

I am trying to form-post a SQL file that consists of many INSERT statements, e.g.

INSERT INTO `TABLE` VALUES ('abcdé', 2759);

I then use re.search to parse it and extract the fields to put into my own datastore. The problem is that, although the file contains accented characters (note that the e is an é), the accent is lost once the file is uploaded: the code either errors or stores a bytestring representation of the character.

Here's what I am currently using (I have tried loads of alternatives):

import cgi

form = cgi.FieldStorage()
uFile = form['sql']        # the uploaded file field
uSql = uFile.file.read()   # raw byte string, not decoded
lineX = uSql.split("\n")   # to get each line

and so on.

Has anyone got a robust way of making this work? Remember that I am on App Engine, so access to some libraries is restricted or forbidden.


You mention utf8 in the question's title but then never again: what are you doing (in terms of setting headers and checking them) to verify which encoding is in use? There should be headers of the form

Content-Type: text/plain; charset=utf-8

and the charset= part is where the encoding is specified. So what are the values upon sending and receiving this? If the charset is wrong, you may have to perform some encoding and decoding manually. To help us gauge what the encoding seems to be, besides the headers, what's the ord value of that accented e? E.g., if the encoding were actually iso-8859-1, that ord value would be 233 (in decimal; 0xE9 in hex). In UTF-8 the same character would instead arrive as the two bytes 0xC3 0xA9.
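
As a rough illustration of the checks described above, here is a minimal sketch using only the standard library. The helper names charset_from_header and decode_upload are hypothetical, not part of any App Engine API, and the fallback from UTF-8 to ISO-8859-1 is an assumption about what the uploaded file might contain rather than a general-purpose detector.

```python
# -*- coding: utf-8 -*-
from email.message import Message


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_upload(raw):
    """Hypothetical helper: try strict UTF-8 first, then fall back to ISO-8859-1.

    ISO-8859-1 maps every byte to a code point, so the fallback never raises,
    but it is only a guess when the real encoding is unknown.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1"), "iso-8859-1"


# The sample line from the question, encoded both ways to simulate two uploads.
line = u"INSERT INTO `TABLE` VALUES ('abcd\u00e9', 2759);"
as_utf8 = line.encode("utf-8")
as_latin1 = line.encode("iso-8859-1")

text_a, guess_a = decode_upload(as_utf8)
text_b, guess_b = decode_upload(as_latin1)

# The character right after "abcd" is the accented e; its ord value is the
# same once the bytes have been decoded with the right encoding.
accent = text_a[text_a.index(u"abcd") + 4]
print(charset_from_header("text/plain; charset=utf-8"), guess_a, guess_b, ord(accent))
```

In the question's code, the bytes passed to decode_upload would presumably be the result of uFile.file.read(); splitting into lines and running re.search would then happen on the decoded text rather than on the raw byte string.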
