Unicode error in Python [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center. Closed 11 years ago.

In the code below I get an error at mailServer.sendmail(gmailUser, m.to_addr, msg.as_string()):

 2011-08-12 17:33:02,542 ERROR  send exception


  Traceback (most recent call last):
    File "sendmail.py", line 33, in bulksend
      mailServer.sendmail(gmailUser, m.to_addr, msg.as_string()).replace(u'\xa0', '')
    File "/usr/lib/python2.4/email/Message.py", line 129, in as_string
      g.flatten(self, unixfrom=unixfrom)
    File "/usr/lib/python2.4/email/Generator.py", line 82, in flatten
      self._write(msg)
    File "/usr/lib/python2.4/email/Generator.py", line 113, in _write
      self._dispatch(msg)
    File "/usr/lib/python2.4/email/Generator.py", line 139, in _dispatch
      meth(msg)
    File "/usr/lib/python2.4/email/Generator.py", line 205, in _handle_multipart
      g.flatten(part, unixfrom=False)
    File "/usr/lib/python2.4/email/Generator.py", line 82, in flatten
      self._write(msg)
    File "/usr/lib/python2.4/email/Generator.py", line 113, in _write
      self._dispatch(msg)
    File "/usr/lib/python2.4/email/Generator.py", line 139, in _dispatch
      meth(msg)
    File "/usr/lib/python2.4/email/Generator.py", line 182, in _handle_text
      self._fp.write(payload)
  UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 173: ordinal not in range(128)

This is the send method:

# Python 2.4-era module paths, matching the traceback above.
# MailQueue and write_exception come from the asker's project (not shown).
import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEText import MIMEText

def send(request):
    try:
        qs = "......."
        if qs.count():
            smaid = qs[0].id
            gmailUser = 'no-reply@xx.com'
            gmailPassword = 'xx'
            # Connect to Gmail and upgrade to TLS before logging in.
            mailServer = smtplib.SMTP('smtp.gmail.com', 587)
            mailServer.ehlo()
            mailServer.starttls()
            mailServer.ehlo()
            mailServer.login(gmailUser, gmailPassword)
            # Send every queued message that has not been sent yet.
            tosend = MailQueue.objects.filter(school = smaid, send = 0)
            for m in tosend:
                msg = MIMEMultipart()
                msg['From'] = gmailUser
                msg['To'] = m.to_addr
                msg["Content-type"] = "text/html"
                sub = m.subject
                sub = sub.replace(u"\u2019"," ")
                msg['Subject'] = sub
                body = m.body
                body = body.replace(u"\u2019"," ")
                msg.attach(MIMEText(body, 'html'))
                mailServer.sendmail(gmailUser, m.to_addr, msg.as_string())
                m.send = 1
                m.save()
            mailServer.close()
    except:
        write_exception("send exception")


First, you've got a bug that hasn't been triggered yet, in the line before the sendmail call. MIMEText defaults to the ASCII charset, which obviously won't work for non-ASCII unicode. You'd think it would default to utf-8 when passed non-ASCII unicode, but it doesn't. (I consider this a bug, but it is too late to fix it in Python 2.) So your code should tell MIMEText which charset to use:

msg.attach(MIMEText(body, 'html', 'utf-8'))
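
To see what the explicit charset buys you, here is a minimal sketch. The sample body is made up, and I ran it against a current Python rather than 2.4, so treat the details as illustrative:

```python
from email.mime.text import MIMEText

# Hypothetical body containing a non-breaking space (U+00A0),
# the same character the traceback complains about.
body = u'<p>Price:\xa0100</p>'

# Passing the charset explicitly makes MIMEText transfer-encode the payload.
part = MIMEText(body, 'html', 'utf-8')

# The flattened part can now be written to an ASCII-only stream.
flattened = part.as_string()
ascii_bytes = flattened.encode('ascii')

# The original text is still recoverable from the encoded payload.
recovered = part.get_payload(decode=True).decode('utf-8')
```

The point is only that the flattened part contains nothing outside ASCII, while the charset parameter tells the receiving mailer how to decode it.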

But your error is coming after the MIMEText step, which indicates it is probably unicode in your headers. As mentioned, you can't send unicode to SMTP. But the answer is not to encode it to utf-8. You can't send utf-8 over SMTP in headers either (only in bodies).

To properly content-transfer-encode unicode in headers, use the Header class (email.header.Header):

msg['Subject'] = Header(sub, header_name='Subject')

Yes, this is a pain. It is also a bit ugly since it CTE-encodes the entire header, instead of just the parts that are non-ASCII. We're working on making this easier and better in Python 3, but we aren't quite there yet.
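
A small sketch of what Header does with a non-ASCII subject. The subject text is invented, and I pass 'utf-8' explicitly as the charset, which is my choice here rather than something the call above requires:

```python
from email.header import Header, decode_header, make_header

# Hypothetical subject with non-ASCII characters.
subject = u'R\xe9sum\xe9 received'

# Encode the subject as an RFC 2047 encoded word, safe for a header.
encoded = Header(subject, 'utf-8', header_name='Subject').encode()

# Round trip: decode the encoded word back to the original text.
decoded = str(make_header(decode_header(encoded)))
```

Here encoded is a plain ASCII string starting with =?utf-8?, which is what gets put on the wire, and decoded shows a receiving client can get the original subject back.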

Addresses with unicode in them are more complicated. You have to use Header to encode the display name, then pass that to formataddr:

disp_name = u'some unicode string'
addr = 'some@address.example.com'
msg['To'] = formataddr((str(Header(disp_name)), addr))

This address trick is not documented. Many Python email programs call Header on the entire address header, but this produces RFC-invalid results (fortunately a lot of mailers handle the decoding correctly anyway).
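
Spelled out a little more, as a sketch: the display name is made up, and I use Header(...).encode() instead of str(Header(...)) so the snippet behaves the same on Python 2 and Python 3 (on Python 3, str() of a Header gives back the unencoded text).

```python
from email.header import Header, decode_header, make_header
from email.utils import formataddr, parseaddr

# Hypothetical display name with non-ASCII characters.
disp_name = u'Jos\xe9 Mart\xednez'
addr = 'some@address.example.com'

# Encode only the display name; the address itself stays untouched.
encoded_name = Header(disp_name, 'utf-8').encode()
to_value = formataddr((encoded_name, addr))

# Round trip: split the header value and decode the display name.
name_part, addr_part = parseaddr(to_value)
decoded_name = str(make_header(decode_header(name_part)))
```

The resulting to_value is pure ASCII, with the encoded word in front and the bare address in angle brackets, which is the shape the RFCs expect. As far as I know, formataddr on Python 3.3 and later also accepts a charset argument and encodes a non-ASCII display name itself.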

All of this should be much better in Python 3.3.
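
For comparison, a sketch of roughly what this looks like with the newer API, assuming Python 3.6 or later where email.message.EmailMessage is available. The addresses, subject and body are placeholders, and the explicit quoted-printable transfer encoding is my choice to keep the output 7-bit clean:

```python
from email import message_from_bytes, policy
from email.headerregistry import Address
from email.message import EmailMessage

# Placeholder content with non-ASCII characters in a header and in the body.
subject = 'Caf\u00e9 menu'
body = '<p>Price:\u00a0100</p>'

msg = EmailMessage()
msg['From'] = Address('No Reply', 'no-reply', 'example.com')
msg['To'] = Address('Jos\u00e9', 'jose', 'example.com')
msg['Subject'] = subject  # header encoding is handled when flattening
msg.set_content(body, subtype='html', cte='quoted-printable')

# Flatten with the SMTP policy (CRLF line endings, encoded headers).
wire = msg.as_bytes(policy=policy.SMTP)

# Parse the bytes back to check that nothing was lost.
parsed = message_from_bytes(wire, policy=policy.default)
```

No Header or formataddr calls are needed here; the message object does the encoding when it is flattened. With smtplib, such a message can be handed to SMTP.send_message instead of calling sendmail with a string.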


SMTP does not understand unicode. You have to encode the headers and the message body to byte strings before passing them to smtplib.

I would recommend using marrow.mailer instead of rolling your own. marrow.mailer already encodes everything for you, even Internationalized Domain Names.

https://github.com/marrow/marrow.mailer


It looks to me like when you do msg.as_string(), the library is eventually writing it to a file-like object, which is where the error is thrown. It's likely that the object is expecting an ASCII encoding, so Unicode is not supported.
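
The same failure can be reproduced without the email package at all. In Python 2, writing a unicode string to a byte-oriented stream presumably triggers an implicit ASCII encode; the sketch below makes that encode explicit, using a made-up payload:

```python
# Hypothetical payload with a non-breaking space (U+00A0) at index 6.
payload = u'price:\xa0100'

bad_char = None
position = None
try:
    # This is the step that fails: U+00A0 has no ASCII representation.
    payload.encode('ascii')
except UnicodeEncodeError as exc:
    # The exception records which character failed and where.
    bad_char = exc.object[exc.start:exc.end]
    position = exc.start
```

The position reported in the traceback (173) is this same offset, counted within the text part being written.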


It's the msg.as_string() that fails. It fails because the body is unicode (instead of str) and contains some code points above 127.

To fix this, make sure the body in msg.attach(MIMEText(body, 'html')) is of type str, probably by encoding it.

msg.attach(MIMEText(body.encode('utf-8'), 'html'))

You will have to set the encoding too. I think it would be:

msg["Content-type"] = "text/html;charset=utf-8"