
DjangoUnicodeDecodeError and force_unicode

I have a simple Django model for a news entry:

class NewsEntry(models.Model):
    pub_date = models.DateTimeField('date published')
    title = models.CharField(max_length = 200)
    summary = models.TextField()
    content = models.TextField()

    def __unicode__(self):
        return self.title

Adding a news entry (in the Admin page) with English text works fine, but when I try to add one with Russian text I get this error:

TemplateSyntaxError at /admin/news/newsentry/

Caught DjangoUnicodeDecodeError while rendering: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128). You passed in NewsEntry: [Bad Unicode data] (class 'antek.news.models.NewsEntry')

Django Version: 1.2.2

Exception Type: TemplateSyntaxError

Exception Value: Caught DjangoUnicodeDecodeError while rendering: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128). You passed in NewsEntry: [Bad Unicode data] (class 'antek.news.models.NewsEntry')

Exception Location: /usr/local/lib/python2.6/dist-packages/django/utils/encoding.py in force_unicode, line 88

Python Version: 2.6.5

The last item in the traceback list is:

/usr/local/lib/python2.6/dist-packages/django/utils/encoding.py in force_unicode

Local vars:

e: UnicodeDecodeError('ascii', '\xd0\xa2\xd0\xb5\xd1\x81\xd1\x82 \xd1\x80\xd1\x83\xd1\x81\xd1\x81\xd0\xba\xd0\xbe\xd0\xb3\xd0\xbe', 0, 1, 'ordinal not in range(128)')

The code looks correct: self.title is a unicode object. Also, djangoproject.com uses similar code in its blog application.

I spent a lot of time on this problem and found a strange solution:

from django.utils.encoding import force_unicode
# ...
def __unicode__(self):
    return force_unicode(self.title)

But since self.title is a unicode object, force_unicode should return it unchanged.

Why doesn't return self.title work?


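The problem was the utf8_bin collation on the MySQL server. With that collation the database driver can return character columns as byte strings instead of unicode objects, so self.title was not actually unicode. Full information here.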
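This would also account for why force_unicode(self.title) helped: force_unicode decodes byte strings as UTF-8 by default, whereas an implicit conversion on Python 2 falls back to ASCII. The snippet below is only a sketch of that difference, assuming title arrived as UTF-8 bytes; to_text is a hypothetical stand-in, not Django's implementation.

```python
# -*- coding: utf-8 -*-
def to_text(value, encoding='utf-8'):
    """Hypothetical helper: decode byte strings, pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# First word of the title from the traceback in the question, as UTF-8 bytes.
title_from_db = b'\xd0\xa2\xd0\xb5\xd1\x81\xd1\x82'

decoded = to_text(title_from_db)   # bytes are decoded as UTF-8
unchanged = to_text(decoded)       # already text, returned as-is

try:
    # An implicit bytes-to-unicode conversion on Python 2 attempts this.
    title_from_db.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc              # fails on byte 0xd0 at position 0
```

If the value really were a unicode object, as the question assumed, the helper would return it untouched, which is why the workaround looked like it should have been a no-op.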


force_unicode carries a risk of data loss. If you know the encoding of the data you are getting, it is more reliable to use Python's decode method to convert it properly. For a 'latin1' string, for example, this is as simple as:

my_unicode_string = my_latin1_string.decode('latin1')
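A short illustration of why knowing the encoding matters, using made-up sample bytes (the variable names are illustrative only): the wrong codec either raises an error or, worse, succeeds and quietly produces garbage.

```python
# -*- coding: utf-8 -*-
raw = b'Lt\xe9e'                      # "Ltée" encoded as latin1
text = raw.decode('latin1')           # correct codec: 4 characters

try:
    raw.decode('utf-8')               # wrong codec: 0xe9 is not valid UTF-8 here
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

utf8_raw = text.encode('utf-8')       # the same text as UTF-8 bytes
mojibake = utf8_raw.decode('latin1')  # no error, but 5 wrong characters
```

Because latin1 maps every byte to some character, decoding with it never fails, so a mistaken guess shows up as corrupted text rather than as an exception.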


My situation was even more peculiar. I was importing data from a JSON file, and the instance created in memory would throw a Unicode error like this:

DjangoUnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 21: ordinal not in range(128). You passed in <Company: [Bad Unicode data]> (<class 'companies.models.Company'>)

But retrieving it from the database and running the code again worked without an issue. So if you get a Django error that contains [Bad Unicode data], try re-fetching the object after saving it as a workaround.

...
company.save()
company = Company.objects.get(pk=company.pk) # avoiding bizarre [Bad Unicode data] error
logger.info("Company (locality exists) '{0}' created".format(company))
...

If anyone can properly explain why, feel free. My guess is that the input data is not encoded in UTF-8:

...
"address_city": "Dolbeau-Mistassini", 
"name": "Bleuets Mistassini Lt\u00e9e", 
...
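For what it is worth, the JSON shown is plain ASCII: the \u00e9 escape is turned into a real Unicode character by the parser regardless of the file's encoding. The sketch below (using only the name value from the excerpt) shows that encoding that name to UTF-8 and then decoding it as ASCII fails on exactly the byte and position reported in the error. This suggests, but does not prove, that somewhere the name was held as UTF-8 bytes and implicitly decoded as ASCII; it does not explain why re-fetching the object avoids the problem.

```python
# -*- coding: utf-8 -*-
import json

# Only the "name" field from the excerpt; the \u00e9 escape is pure ASCII on disk.
record = json.loads('{"name": "Bleuets Mistassini Lt\\u00e9e"}')
name = record['name']                 # unicode text containing U+00E9

utf8_bytes = name.encode('utf-8')     # U+00E9 becomes the two bytes 0xc3 0xa9

try:
    utf8_bytes.decode('ascii')
    failure = None
except UnicodeDecodeError as exc:
    # Record where decoding stopped and which byte caused it.
    failure = (exc.start, utf8_bytes[exc.start:exc.start + 1])
```

Here failure holds position 21 and byte 0xc3, matching the "byte 0xc3 in position 21" in the error message.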