UnicodeDecodeError with Django's request.FILES
I have the following code in my view:
def view(request):
    body = u""
    for filename, f in request.FILES.items():
        body = body + 'Filename: ' + filename + '\n' + f.read() + '\n'
In some cases I get:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 7470: ordinal not in range(128)
What am I doing wrong? (I am using Django 1.1.)
Thank you.
Django has some utilities in django.utils.encoding that handle this (smart_unicode, force_unicode, smart_str). Generally you just need smart_unicode, which decodes byte strings as UTF-8 by default.
from django.utils.encoding import smart_unicode

def view(request):
    body = u""
    for filename, f in request.FILES.items():
        body = body + 'Filename: ' + filename + '\n' + smart_unicode(f.read()) + '\n'
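You are appending f.read() directly to a unicode string without decoding it, so Python 2 tries to decode the bytes implicitly with the ascii codec and fails on the first non-ASCII byte. If the data you are reading from the file is UTF-8 encoded, use utf-8; otherwise use whatever encoding it is in.
Decode it first and then append it to body, e.g.: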
data = f.read().decode("utf-8")
body = body + 'Filename: ' + filename + '\n' + data + '\n'
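To make that concrete, here is a minimal sketch of the same loop with an explicit decode step, written so that it runs on both Python 2 and Python 3. The name build_body, its encoding parameter and the choice of errors="replace" are my own assumptions, not something from the question; it takes plain (filename, raw_bytes) pairs instead of request.FILES so it can be tried without Django.

```python
def build_body(files, encoding="utf-8"):
    """Build one unicode string from (filename, raw_bytes) pairs.

    Hypothetical helper: `files` stands in for
    ((name, f.read()) for name, f in request.FILES.items()).
    """
    parts = []
    for filename, raw in files:
        # Decode explicitly instead of relying on an implicit ASCII decode.
        # "replace" turns undecodable bytes into U+FFFD rather than raising.
        text = raw.decode(encoding, "replace")
        parts.append(u"Filename: " + filename + u"\n" + text + u"\n")
    return u"".join(parts)


body = build_body([(u"a.txt", b"caf\xc3\xa9"), (u"b.bin", b"\xf0")])
```

Using "replace" trades correctness for robustness: the view no longer crashes on binary uploads, but bytes that are not valid in the chosen encoding are lost. Drop that argument if you would rather get the exception.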
Anurag's answer is correct. However, another problem here is that you can't know for certain the encoding of the files that users upload. It may be useful to loop over a tuple of the most common ones until one of them decodes without error:
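# 'windows-xxx' and 'iso-yyy' are placeholders; substitute real codec names.
encodings = ('windows-xxx', 'iso-yyy', 'utf-8',)
raw = f.read()  # read once; a second f.read() would return an empty string
data = None     # stays None if no encoding in the tuple fits
for e in encodings:
    try:
        data = raw.decode(e)
        break
    except UnicodeDecodeError:
        pass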
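The same idea can be wrapped in a small function. This is only a sketch under my own assumptions: the name decode_with_fallbacks and the default codec list ("utf-8", "cp1252", "latin-1") are not from the answer above, which leaves the concrete encodings open. The order matters: a strict codec such as utf-8 should come first, because latin-1 accepts every possible byte and would otherwise always "succeed".

```python
def decode_with_fallbacks(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding_used) for the first codec that decodes `raw`.

    Hypothetical helper; the default codec list is an assumption.
    If nothing fits, decode as UTF-8 with replacement characters
    and report None as the encoding.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue  # try the next candidate
    return raw.decode("utf-8", "replace"), None


text, used = decode_with_fallbacks(b"caf\xe9")
```

Keep in mind that a successful decode only means the bytes were valid in that codec, not that the guess was right, so the result can still be mojibake.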
If you are not in control of the encoding of the files that can be uploaded, you can guess which encoding a file is in using the Universal Encoding Detector module, chardet.
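A rough sketch of how that could look, assuming chardet is installed (it is a third-party package). chardet.detect() takes the raw bytes and returns a dict with "encoding" and "confidence" keys; the wrapper decode_guess, its default parameter and the fallback behaviour are my own additions for illustration.

```python
try:
    import chardet  # third-party: pip install chardet
except ImportError:
    chardet = None  # the sketch then just uses the default encoding


def decode_guess(raw, default="utf-8"):
    """Decode raw bytes using chardet's guess, falling back to `default`.

    Hypothetical helper, not part of chardet or Django.
    """
    encoding = default
    if chardet is not None:
        guess = chardet.detect(raw)
        # "encoding" can be None when chardet has no idea (e.g. empty input).
        encoding = guess.get("encoding") or default
    try:
        return raw.decode(encoding, "replace")
    except LookupError:
        # chardet named a codec that this Python does not provide.
        return raw.decode(default, "replace")


text = decode_guess(b"plain ascii upload")
```

The detection is a statistical guess and tends to be unreliable on very short inputs, so it is worth checking the "confidence" value before trusting the result for anything important.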