Unicode error in Python [closed]
In the code below, I get an error at mailServer.sendmail(gmailUser, m.to_addr, msg.as_string()):
2011-08-12 17:33:02,542 ERROR send exception
Traceback (most recent call last):
File "sendmail.py", line 33, in bulksend
mailServer.sendmail(gmailUser, m.to_addr, msg.as_string()).replace(u'\xa0', '')
File "/usr/lib/python2.4/email/Message.py", line 129, in as_string
g.flatten(self, unixfrom=unixfrom)
File "/usr/lib/python2.4/email/Generator.py", line 82, in flatten
self._write(msg)
File "/usr/lib/python2.4/email/Generator.py", line 113, in _write
self._dispatch(msg)
File "/usr/lib/python2.4/email/Generator.py", line 139, in _dispatch
meth(msg)
File "/usr/lib/python2.4/email/Generator.py", line 205, in _handle_multipart
g.flatten(part, unixfrom=False)
File "/usr/lib/python2.4/email/Generator.py", line 82, in flatten
self._write(msg)
File "/usr/lib/python2.4/email/Generator.py", line 113, in _write
self._dispatch(msg)
File "/usr/lib/python2.4/email/Generator.py", line 139, in _dispatch
meth(msg)
File "/usr/lib/python2.4/email/Generator.py", line 182, in _handle_text
self._fp.write(payload)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 173: ordinal not in range(128)
This is the send method:
def send(request):
    try:
        qs = "......."
        if qs.count():
            smaid = qs[0].id
            gmailUser = 'no-reply@xx.com'
            gmailPassword = 'xx'
            mailServer = smtplib.SMTP('smtp.gmail.com', 587)
            mailServer.ehlo()
            mailServer.starttls()
            mailServer.ehlo()
            mailServer.login(gmailUser, gmailPassword)
            tosend = MailQueue.objects.filter(school = smaid, send = 0)
            for m in tosend:
                msg = MIMEMultipart()
                msg['From'] = gmailUser
                msg['To'] = m.to_addr
                msg["Content-type"] = "text/html"
                sub = m.subject
                sub = sub.replace(u"\u2019"," ")
                msg['Subject'] = sub
                body = m.body
                body = body.replace(u"\u2019"," ")
                msg.attach(MIMEText(body, 'html'))
                mailServer.sendmail(gmailUser, m.to_addr, msg.as_string())
                m.send = 1
                m.save()
            mailServer.close()
    except:
        write_exception("send exception")
First, you've got a latent bug, not yet triggered, on the line before the sendmail call. MIMEText defaults to the ASCII charset, which obviously won't work for non-ASCII unicode. You'd think it would default to utf-8 when passed non-ASCII unicode, but it doesn't. (I consider this a bug, but it is too late to fix it in Python 2.) So your code should tell MIMEText which charset to use:
msg.attach(MIMEText(body, 'html', 'utf-8'))
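As a rough sketch of what the charset argument changes: the sample body below is made up, and the import path assumes Python 2.5 or later (or Python 3), not the 2.4 layout shown in the traceback. With an explicit charset, the part is labelled as utf-8 and flattened to ASCII-only text.

```python
from email.mime.text import MIMEText

# Hypothetical body containing the same non-breaking space (u'\xa0')
# that appears in the traceback.
body = u"<p>Price:\xa0100</p>"

# Passing the charset explicitly makes MIMEText label the part as utf-8
# and apply a content-transfer-encoding to the payload.
part = MIMEText(body, "html", "utf-8")

flattened = part.as_string()
charset = part.get_content_charset()
```

The flattened text should contain only ASCII characters, which is what the generator needs when it writes the message out.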
But your error is coming after the MIMEText step, which indicates the cause is probably unicode in your headers. As mentioned, you can't send unicode over SMTP. But the answer is not to encode it to utf-8: you can't send raw utf-8 in headers over SMTP either (only in bodies).
To properly content-transfer-encode unicode in headers, use the Header class (email.header.Header):
msg['Subject'] = Header(sub, header_name='Subject')
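To illustrate what Header produces, here is a small sketch. The subject text is invented, and passing "utf-8" as the charset explicitly is my choice for clarity rather than something the answer requires. The encoded form is plain ASCII and can be decoded back to the original text.

```python
from email.header import Header, decode_header

# Hypothetical subject with one non-ASCII character.
sub = u"Caf\xe9 menu"

# encode() returns the RFC 2047 "encoded-word" form of the header value.
encoded = Header(sub, "utf-8", header_name="Subject").encode()

# decode_header() reverses the process: it yields (bytes, charset) pairs.
raw, charset = decode_header(encoded)[0]
round_tripped = raw.decode(charset)
```

Assigning the Header object itself to msg['Subject'], as in the line above, lets the email package do this encoding when the message is flattened.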
Yes, this is a pain. It is also a bit ugly, since it CTE-encodes the entire header instead of just the non-ASCII parts. We're working on making this easier and better in Python 3, but we aren't quite there yet.
Addresses with unicode in them are more complicated. You have to use Header to encode the display name, then pass that to formataddr:
disp_name = u'some unicode string'
addr = 'some@address.example.com'
msg['To'] = formataddr((str(Header(disp_name)), addr))
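A runnable variant of the same idea, as a sketch with an invented display name. It calls .encode() instead of str(), because on Python 3 str() of a Header returns the unencoded text. The explicit "utf-8" charset is again my assumption.

```python
from email.header import Header
from email.utils import formataddr, parseaddr

# Hypothetical display name; the address is the placeholder used above.
disp_name = u"Jos\xe9 Garc\xeda"
addr = "some@address.example.com"

# Encode only the display name, then let formataddr build "name <addr>".
encoded_name = Header(disp_name, "utf-8").encode()
to_header = formataddr((encoded_name, addr))

# The address part stays untouched and can still be parsed out.
parsed_name, parsed_addr = parseaddr(to_header)
```

Only the display name is encoded here, so the angle-bracketed address remains readable to any mailer.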
This address trick is not documented. Many Python email programs call Header on the entire address header, but this produces RFC-invalid results (fortunately, a lot of mailers handle the decoding correctly anyway).
All of this should be much better in Python 3.3.
SMTP does not understand unicode. You have to encode the headers and the message body to byte strings before passing them to smtplib.
I would recommend using marrow.mailer instead of rolling your own. marrow.mailer already encodes everything for you, even Internationalized Domain Names.
https://github.com/marrow/marrow.mailer
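If adding a dependency is not an option, the encoding step described above can be sketched with the standard library alone. This is only an illustration: build_message and all of its sample arguments are hypothetical names, and the import paths assume Python 2.5 or later (or Python 3).

```python
from email.header import Header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formataddr


def build_message(sender, to_name, to_addr, subject, html_body):
    """Hypothetical helper: return message text that contains only ASCII."""
    msg = MIMEMultipart()
    msg["From"] = sender
    # Encode the display name only, never the whole address header.
    msg["To"] = formataddr((Header(to_name, "utf-8").encode(), to_addr))
    msg["Subject"] = Header(subject, "utf-8", header_name="Subject")
    # The explicit charset gives the body a content-transfer-encoding.
    msg.attach(MIMEText(html_body, "html", "utf-8"))
    return msg.as_string()


wire_text = build_message(
    "no-reply@xx.com",
    u"Ren\xe9e",
    "some@address.example.com",
    u"Today\u2019s update",
    u"<p>Total:\xa0100</p>",
)
# Raises UnicodeEncodeError if any non-ASCII character slipped through.
wire_bytes = wire_text.encode("ascii")
```

The returned text is what would be passed as the third argument to sendmail.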
It looks to me like when you do msg.as_string(), the library is eventually writing it to a file-like object, which is where the error is thrown. It's likely that the object is expecting an ASCII encoding, so Unicode is not supported.
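The failure can be reproduced without any mail code, which may make the cause easier to see. In this sketch the payload is an invented stand-in for the message body; forcing it through the ASCII codec raises the same error as in the traceback, while encoding to utf-8 succeeds.

```python
# Hypothetical stand-in for the message body: text with a non-breaking space.
payload = u"Hello\xa0world"

try:
    # Roughly what happens implicitly when unicode text is written
    # somewhere that only accepts ASCII.
    payload.encode("ascii")
    error_text = None
except UnicodeEncodeError as exc:
    error_text = str(exc)

# Encoding explicitly to utf-8 works and yields a byte string.
utf8_bytes = payload.encode("utf-8")
```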
It's the msg.as_string() call that fails. It fails because the body is unicode (instead of str) and contains some code points outside the ASCII range (128 and above).
To fix this, make sure the body in msg.attach(MIMEText(body, 'html')) is of type str, probably by encoding it.
msg.attach(MIMEText(body.encode('utf-8'), 'html'))
You will have to set the encoding too. I think it would be:
msg["Content-type"] = "text/html;charset=utf-8"