Python not opening Japanese filenames

I've been working on a Python script that opens a file with a Unicode name (mostly Japanese) and saves it under a randomly generated (non-Unicode) filename on Windows Vista 64-bit, and I'm having issues. It works fine with non-Unicode filenames (even if the file has Unicode content), but as soon as I pass in a Unicode filename, it doesn't work.

Here's the code:

import sys

try:
    inpath = sys.argv[1]
    outpath = sys.argv[2]
    # Copy the input file byte-for-byte to the output path.
    filein = open(inpath, "rb")
    contents = filein.read()
    filein.close()
    fileSave = open(outpath, "wb")
    fileSave.write(contents)
    fileSave.close()

    # Record the input path exactly as it was received, for debugging.
    testfile = open(outpath + '.test', 'wb')
    testfile.write(inpath)
    testfile.close()

except:
    # Log whatever went wrong.
    errlog = open('G:\\log.txt', 'w')
    errlog.write(str(sys.exc_info()))
    errlog.close()

And the error:

(<type 'exceptions.IOError'>, IOError(2, 'No such file or directory'), <traceback object at 0x01092A30>)


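You have to convert inpath to a unicode string before passing it to open(). The example below assumes the argument arrives as UTF-8 bytes; on Windows it is usually in the ANSI code page instead, in which case sys.getfilesystemencoding() is the encoding to decode with: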

inpath = sys.argv[1]
inpath = inpath.decode("UTF-8")
filein = open(inpath, "rb")

I'm guessing you are using Python 2.6, because in Python 3, all strings are unicode by default, so this problem wouldn't happen.
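
The decoding step can be wrapped in a small helper so that the same script behaves sensibly on both Python 2 and Python 3. This is only a sketch: to_text is a hypothetical name, and falling back to the filesystem encoding (and then to UTF-8) is an assumption about how the arguments were encoded, not something confirmed for the asker's setup.

```python
import sys

def to_text(arg, encoding=None):
    """Return arg as a unicode string, decoding it only if it is bytes."""
    if isinstance(arg, bytes):
        # Assumption: unless told otherwise, the bytes are in the
        # filesystem encoding; UTF-8 is a last-resort fallback.
        enc = encoding or sys.getfilesystemencoding() or "utf-8"
        return arg.decode(enc)
    # Already a unicode string (always the case for sys.argv on Python 3).
    return arg
```

It would be used as open(to_text(sys.argv[1]), "rb"). If the characters were already lost before Python received the argument, no decoding can bring them back.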


My guess is that sys.argv[1] and sys.argv[2] are just byte strings and don't natively support Unicode. You could confirm this by printing them and seeing whether they contain the characters you expect. You should also print type(sys.argv[1]) to check which type they are.
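
A quick way to run that check is to dump the type and repr of every argument. The helper name describe_args is made up for this sketch.

```python
import sys

def describe_args(argv):
    """Return one line per argument showing its index, type name and repr."""
    return ["%d: %s %r" % (i, type(a).__name__, a) for i, a in enumerate(argv)]

if __name__ == "__main__":
    for line in describe_args(sys.argv):
        print(line)
```

On Python 2 the repr shows the raw bytes, so a Japanese filename that arrives as a run of question marks was probably mangled before the script ever saw it.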

Where do the command-line parameters come from? Do they come from another program or are you typing them on the command-line? If they come from another program, you could have the other program encode them to UTF-8 and then have your Python program decode them from UTF-8.

Which version of Python are you using?

Edit: here's a robust solution: http://code.activestate.com/recipes/572200/
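
For reference, one common way to get the original wide-character command line on Windows under Python 2 is to ask the Win32 API directly through ctypes. The sketch below is not copied from the linked recipe and has not been checked against it; windows_unicode_argv is a hypothetical name, and the function simply returns None on other platforms.

```python
import sys

def windows_unicode_argv():
    """Return the arguments as unicode strings on Windows, else None."""
    if sys.platform != "win32":
        return None
    import ctypes
    from ctypes import wintypes

    GetCommandLineW = ctypes.windll.kernel32.GetCommandLineW
    GetCommandLineW.argtypes = []
    GetCommandLineW.restype = wintypes.LPCWSTR

    CommandLineToArgvW = ctypes.windll.shell32.CommandLineToArgvW
    CommandLineToArgvW.argtypes = [wintypes.LPCWSTR,
                                   ctypes.POINTER(ctypes.c_int)]
    CommandLineToArgvW.restype = ctypes.POINTER(wintypes.LPWSTR)

    argc = ctypes.c_int(0)
    argv = CommandLineToArgvW(GetCommandLineW(), ctypes.byref(argc))
    if not argv:
        return None
    # The raw command line also contains the interpreter and its options;
    # keep only the trailing entries that correspond to sys.argv.
    # Note: the buffer returned by CommandLineToArgvW is not freed here,
    # which is tolerable for a single call at startup.
    start = argc.value - len(sys.argv)
    return [argv[i] for i in range(start, argc.value)]
```

On Python 3 this is unnecessary, since sys.argv already holds unicode strings.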
