Python App Engine: form-posted UTF-8 file issue
I am trying to form-post an SQL file that consists of many INSERT statements, e.g.
INSERT INTO `TABLE` VALUES ('abcdé', 2759);
Then I use re.search to parse it and extract the fields to put into my own datastore. The problem is that, although the file contains accented characters (see the e is a é), once uploaded it loses them and either raises an error or stores a bytestring representation of them.
Here's what I am currently using (and I have tried loads of alternatives):
form = cgi.FieldStorage()
uFile = form['sql']
uSql = uFile.file.read()
lineX = uSql.split("\n") # to get each line
and so on.
Has anyone got a robust way of making this work? Remember I am on App Engine, so access to some libraries is restricted/forbidden.
You mention utf8 in the Q's title but then never again: what are you doing (in terms of setting headers and checking them) to verify what encoding is in use? There should be headers of the form
Content-Type: text/plain; charset=utf-8
and the charset= part is where the encoding is specified. So what are the values upon sending and receiving this? If charset is erroneous, you may have to manually perform some encoding and decoding. To help us gauge what the encoding seems to be, besides the headers, what's the ord value of that accented-e? E.g., if the encoding was actually iso-8859-1, that ord value would be 233 (in decimal; 0xE9 in hex), whereas in utf-8 the same character arrives as the two bytes 0xC3 0xA9.
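If it does come down to decoding by hand, something along these lines might do as a starting point. It is only a sketch: decode_upload and its declared_charset argument are names made up for illustration, and the fallback order (declared charset, then utf-8, then iso-8859-1) is an assumption about what your uploads are likely to contain, not something the headers guarantee. It uses nothing beyond the built-in codecs, so it should not run into App Engine's library restrictions.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper: decode_upload is not part of cgi or App Engine.

def decode_upload(raw, declared_charset=None):
    """Decode uploaded bytes and return (text, encoding_used).

    Tries the charset declared in the headers (if any), then utf-8,
    then iso-8859-1, which accepts any byte sequence.
    """
    candidates = []
    if declared_charset:
        candidates.append(declared_charset)
    candidates.extend(["utf-8", "iso-8859-1"])
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            # Wrong guess, or a charset name Python does not know.
            continue
    # Not expected to be reached, since iso-8859-1 never fails.
    raise ValueError("could not decode upload")


# The same INSERT line as raw bytes in two different encodings.
utf8_bytes = b"INSERT INTO `TABLE` VALUES ('abcd\xc3\xa9', 2759);"
latin1_bytes = b"INSERT INTO `TABLE` VALUES ('abcd\xe9', 2759);"

for raw in (utf8_bytes, latin1_bytes):
    text, used = decode_upload(raw)
    lines = text.split(u"\n")            # split only after decoding
    accented = text[text.index(u"abcd") + 4]
    print("%s -> ord %d" % (used, ord(accented)))
```

In the handler from the question, raw would be the result of uFile.file.read(). Once decoded, both variants give the same character (ord 233), so the re.search step sees the same text either way.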