Handling UTF-16 in a Django uploaded file
In one part of my Django web app, users can upload a text file in which each line contains a string to be processed. The file isn't stored on the server or anything like that.
My code looks like this:
roFile = request.FILES['uploadFileName']
ros = roFile.read().strip()
ros = ros.split('\n')
ros = [t.strip() for t in ros]
To date, this has worked AOK. Today a user uploaded a file that caused problems. Using the strings from it in Django generates the following error:
ProgrammingError: ERROR: invalid byte sequence for encoding "UTF8":0xff
The user has told me that he saved the file as UTF-16.
Within python proper, I can do the following:
import codecs
from django.utils.encoding import *
fo = codecs.open('filename', 'r', 'utf-16')
zz = fo.readlines()
and then the values seem to be manageable, but not with the file upload.
What is the appropriate way to deal with the data in request.FILES in order to handle the differing character set?
This first part doesn't answer your question (I know nothing about Django); I'd just like to point out that when you supply code that you say works or doesn't work, you should copy/paste the actual code that you ran; don't type it from memory.
This code:
import codecs
from django.utils.encoding
f = codecs.open('filename', 'r', 'utf-16')
zz = fo.readlines()
has 2 problems and looks like it should be:
import codecs
from django.utils.encoding import *
fo = codecs.open('filename', 'r', 'utf-16')
zz = fo.readlines()
To your question: google("django request files") seems to give some useful clues; have you investigated them, including this? One of the clues was that file uploads seem to have been improved in later versions of Django; what version are you using?
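For what it's worth, the 0xff in the error message is consistent with the first byte of a little-endian UTF-16 byte-order mark (FF FE), which suggests the raw bytes were being passed along without being decoded. Below is a minimal sketch of decoding the uploaded bytes before splitting them into lines. The helper name decode_upload and the UTF-8 fallback are my own assumptions, not anything Django provides, and it only works for files that actually start with a BOM; a UTF-16 file saved without one would need the encoding supplied some other way.

```python
import codecs


def decode_upload(raw, fallback="utf-8"):
    """Decode uploaded bytes to a list of stripped lines.

    Hypothetical helper: picks the codec from a leading byte-order mark
    and falls back to `fallback` (assumed UTF-8) when there is none.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to choose the byte order
        # and removes it from the result.
        text = raw.decode("utf-16")
    elif raw.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" drops the UTF-8 BOM.
        text = raw.decode("utf-8-sig")
    else:
        text = raw.decode(fallback)
    # splitlines() copes with both \n and \r\n line endings.
    return [line.strip() for line in text.strip().splitlines()]


# In the view from the question this would presumably be used as:
#   ros = decode_upload(request.FILES['uploadFileName'].read())
```

Decoding once, up front, means everything downstream works with text rather than raw bytes, so the database never sees the BOM.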