Spanish text in .py files

2023-03-09 13:25

This is the code

A = "Diga sí por cualquier número de otro cuidador.".encode("utf-8")

I get this error:

'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)

I tried numerous encodings unsuccessfully.

Edit:

I already have this at the beginning

# -*- coding: utf-8 -*-

Changing to

A = u"Diga sí por cualquier número de otro cuidador.".encode("utf-8")

doesn't help

Are you using Python 2?

In Python 2, that string literal is a bytestring. You're trying to encode it, but you can encode only a Unicode string, so Python will first try to decode the bytestring to a Unicode string using the default "ascii" encoding.

Unfortunately, your string contains non-ASCII characters, so it can't be decoded to Unicode.

The best solution is to use a Unicode string literal, like this:

A = u"Diga sí por cualquier número de otro cuidador.".encode("utf-8")
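The implicit decode described in this answer can be reproduced explicitly. This is a minimal sketch, not Python 2's actual implementation: the helper name encode_like_python2 is made up for illustration, and the sample bytes assume the literal contains the byte 0xed, as the reported error suggests. It is written so that it also runs on Python 3.

```python
# Sample byte string; the 0xed byte is an assumption taken from the error message.
raw = b"Diga s\xed por"


def encode_like_python2(byte_string, target="utf-8"):
    """Mimic what Python 2 does for str.encode(): decode as ASCII, then encode."""
    text = byte_string.decode("ascii")  # the hidden step that fails on non-ASCII bytes
    return text.encode(target)


try:
    encode_like_python2(raw)
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
```

Note that the exception is a UnicodeDecodeError even though the code asked for an encode, which is what makes the original traceback confusing.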

The error message

'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)

says that the 7th byte is 0xed. That is either the first byte of the UTF-8 sequence for some high-ordinal (maybe CJK) Unicode character, which is absolutely not consistent with the reported facts, or it is your i-acute encoded in Latin-1 or cp1252. I'm betting on cp1252.

If your file were encoded in UTF-8, the offending byte would be 0xc3, not 0xed:

Preliminaries:
>>> import unicodedata
>>> unicodedata.name(u'\xed')
'LATIN SMALL LETTER I WITH ACUTE'
>>> uc = u'Diga s\xed por'

What happens if file is encoded in UTF-8:
>>> infile = uc.encode('utf8')
>>> infile
'Diga s\xc3\xad por'
>>> infile.encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)
#### NOT the message reported in the question ####

What happens if file is encoded in cp1252 or latin1 or similar:
>>> infile = uc.encode('cp1252')
>>> infile
'Diga s\xed por'
>>> infile.encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)
#### As reported in the question ####

Having # -*- coding: utf-8 -*- at the start of your code does not magically ensure that your file is encoded in UTF-8 -- that's up to you and your text editor.
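To check what your editor actually wrote, you can test the raw bytes of the file rather than trusting the coding line. The following is a rough sketch only: guess_encoding is a hypothetical helper, and treating every non-UTF-8 file as cp1252 is an assumption that fits this question, not a general detection method.

```python
def guess_encoding(raw_bytes):
    """Return 'utf-8' if the bytes are valid UTF-8, else 'cp1252' as a fallback guess."""
    try:
        raw_bytes.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: a Western European single-byte encoding is the likely alternative.
        return "cp1252"


print(guess_encoding(b"Diga s\xc3\xad por"))  # bytes as a UTF-8 editor would save them
print(guess_encoding(b"Diga s\xed por"))      # bytes as a cp1252 editor would save them
```

For a real file, read it in binary mode (open(path, "rb").read()) and pass the result to the helper.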

Actions:

1. Save your file as UTF-8.
2. As suggested by others, use a Unicode literal: u'blah blah'
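If your editor has no "save as UTF-8" option, the first action can also be done with a few lines of Python. This is a sketch under the assumption that the file is currently cp1252; resave_as_utf8 and the file name in the comment are hypothetical. It uses io.open, which accepts an encoding argument on both Python 2 and Python 3.

```python
import io


def resave_as_utf8(path, current_encoding="cp1252"):
    """Rewrite a text file in place as UTF-8, decoding it from current_encoding."""
    # newline="" leaves the file's existing line endings untouched.
    with io.open(path, "r", encoding=current_encoding, newline="") as src:
        text = src.read()
    with io.open(path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


# Example call (hypothetical file name):
# resave_as_utf8("my_script.py")
```

Keep a backup before running it: if the guess about the current encoding is wrong, the accented characters will be rewritten incorrectly.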

Put this on the first line of your code:

# -*- coding: utf-8 -*-

You should specify your source file's encoding by adding the following line to the very beginning of your code (assuming that your file is encoded in UTF-8):

# Encoding: UTF-8

Otherwise, Python 2 will assume ASCII and fail while parsing as soon as it meets a non-ASCII byte in the source.
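To see which encoding Python reads from such a declaration, the Python 3 standard library offers tokenize.detect_encoding. Note the assumption: this function is not available in Python 2's tokenize module, so the sketch below is for inspection on Python 3 only, and declared_encoding is a hypothetical wrapper name. It reports what the comment declares, not how the file was really saved.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding declared in the first two lines of Python source."""
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


print(declared_encoding(b"# Encoding: UTF-8\nA = 1\n"))
print(declared_encoding(b"# -*- coding: latin-1 -*-\nA = 1\n"))
```

Both comment styles are recognised because the parser only looks for "coding:" or "coding=" followed by an encoding name inside a comment on the first or second line.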

You are probably working with a normal (byte) string, not a unicode string:

>> type(u"zażółć gęślą jaźń")
-> <type 'unicode'>

>> type("zażółć gęślą jaźń")
-> <type 'str'>

u"Diga sí por cualquier número de otro cuidador.".encode("utf-8")

should work.

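If you want string literals to be unicode by default in Python 2, put

from __future__ import unicode_literals

at the top of your script. The # -*- coding: utf-8 -*- line only declares how the source file is encoded; it does not change the type of string literals.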

See also the docs.

P.S. It's Polish in examples above :)

In the first or second line of your code, type the comment:

    # -*- coding: latin-1 -*-

For a list of symbols supported see: http://en.wikipedia.org/wiki/Latin-1_Supplement_%28Unicode_block%29

And the languages covered: http://en.wikipedia.org/wiki/ISO_8859-1

Maybe this is what you want to do:

A = 'Diga sí por cualquier número de otro cuidador'.decode('latin-1')

And don't forget to add # -*- coding: latin-1 -*- at the beginning of your code.
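The decode step suggested here can be tried in isolation. This is a small sketch assuming the bytes really are Latin-1; the byte string below is a shortened, hand-escaped version of the sentence from the question, so it runs the same on Python 2 and Python 3.

```python
# Latin-1 bytes for a shortened version of the sentence (0xed = i-acute, 0xfa = u-acute).
raw = b"Diga s\xed por cualquier n\xfamero"

text = raw.decode("latin-1")       # explicit decode: bytes -> Unicode string
utf8_bytes = text.encode("utf-8")  # encoding a Unicode string needs no implicit ASCII step

print(repr(utf8_bytes))
```

Once the text is a Unicode string, it can be encoded to UTF-8 or any other target encoding without triggering the ASCII error.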
