GAE blobstore filename UTF-8 encoding problem

I have a filename encoding problem with the GAE Blobstore.

class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        upload_files = self.get_uploads('file')
        blob_info = upload_files[0]

        #Problem right here    
        decoded_filename = blob_info.filename.decode("utf-8")
        #

        File_info = Fileinfo(
            key_name=str(blob_info.key()),
            filename=decoded_filename,
            )
        File_info.put()
        self.redirect("/")

When I run it locally, the filename is displayed correctly in the SDK console, but after deploying to GAE the stored filename shows up as an undecoded string such as "=?UTF-8?B?54Wn54mH5pel5pyfIDIwMTAtMDgtMDM=?=" or "=?Big5?B?v8O59afWt9MgMjAxMC0xMi0wMiA=?=".

I suspect the best solution might be to stop using Chinese characters in filenames ...

All suggestions are very welcome :)


This is a known open issue: "Blobstore handler breaking data encoding".


The filename of a BlobInfo is MIME-encoded by Google (as an RFC 2047 encoded-word, the "=?charset?B?...?=" form). I do not know why Google does this.

It breaks filenames for people who use multi-byte character sets.

You can recover the correct filename, whatever character encoding was used, as follows:

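import email.header  # import the submodule explicitly; plain "import email" may not load it

for blob_info in self.get_uploads('file'):
  filename_mime = blob_info.filename
  # decode_header() expects a byte string in Python 2
  if isinstance(filename_mime, unicode):
    filename_mime_utf8 = filename_mime.encode('utf-8')
  else:
    filename_mime_utf8 = filename_mime
  # Only the first decoded part is used here
  filename_encoded, encoding = email.header.decode_header(filename_mime_utf8)[0]
  if encoding is not None:
    filename_unicode = filename_encoded.decode(encoding)
    filename_utf8 = filename_unicode.encode('utf-8')
    # Overwrites a private attribute of BlobInfo; this relies on SDK internals
    blob_info._BlobInfo__entity['filename'] = filename_utf8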
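
To see what the decoding step does in isolation, here is a minimal standalone sketch that does not depend on App Engine. It is written for Python 3 (the handler code above is Python 2), and the function name decode_mime_filename is made up for this illustration. Unlike the loop above, it joins every part returned by decode_header() instead of taking only the first one, on the assumption that a long filename could be split into several encoded-words.

```python
from email.header import decode_header


def decode_mime_filename(raw):
    """Decode an RFC 2047 encoded-word filename into a text string.

    Hypothetical helper for illustration; not part of the Blobstore API.
    """
    parts = []
    for value, charset in decode_header(raw):
        # Encoded parts come back as bytes together with their charset;
        # a header with no encoded-words comes back as a plain str.
        if isinstance(value, bytes):
            value = value.decode(charset or "ascii", errors="replace")
        parts.append(value)
    return "".join(parts)


if __name__ == "__main__":
    # Sample values taken from the question
    print(decode_mime_filename("=?UTF-8?B?54Wn54mH5pel5pyfIDIwMTAtMDgtMDM=?="))
    print(decode_mime_filename("=?Big5?B?v8O59afWt9MgMjAxMC0xMi0wMiA=?="))
```

For the first sample value this gives 照片日期 2010-08-03. A filename that is not MIME-encoded is returned unchanged.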


Here is a tweak to ENDOH takanao's solution, which you can call on each file_info object:

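import email.header

def get_filename_from_file_info(file_info):
    """Return the uploaded file's name as a UTF-8 byte string (Python 2)."""
    filename_mime = file_info.filename
    # decode_header() expects a byte string in Python 2
    if isinstance(filename_mime, unicode):
        filename_mime_utf8 = filename_mime.encode('utf-8')
    else:
        filename_mime_utf8 = filename_mime
    filename_encoded, encoding = email.header.decode_header(filename_mime_utf8)[0]
    if encoding is not None:
        # MIME-encoded name: decode from its declared charset, re-encode as UTF-8
        filename_unicode = filename_encoded.decode(encoding)
        filename_utf8 = filename_unicode.encode('utf-8')
        return filename_utf8
    # Not MIME-encoded: return the name as it was
    return filename_mime_utf8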