How to access file names with non-English characters

2023-01-26 17:21 Asked by:

I have a problem when dealing with non-English filenames.

The problem is that my program cannot guarantee those directories and filenames are in English. If some filenames use Japanese or Chinese characters, it will display characters like '?'.

Can anybody suggest what I need to do to access non-English file names?

The problem is that my program cannot guarantee those directories and filenames are in English. If a filename uses Japanese or Chinese characters, it will display characters like '?'.

The problem is apparently that "it" is using the wrong character set to display the filenames. The solution depends on whether "it" is your program (via a GUI), some other application, the command shell / terminal emulator, or the user's web browser. If you could provide more information, maybe I could offer some suggestions.

But turning the characters into underscores is most likely a bad solution. It is liable to lead to filename clashes, and those Chinese / Japanese / etc characters are most likely meaningful to the people who created the files.

(By the way, the correct term for "English" letters is Latin.)

EDIT

For your use-case, you don't need to store the PDF file using a filename that bears any relation to the supplied filename. I suggest that you try to solve the problem by using a filename consisting of Latin letters and digits generated from (say) System.currentTimeMillis(). If that fails, then your real problem has nothing to do with the filenames at all.
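As a rough illustration of that suggestion, here is a minimal sketch of a generated filename. The class name, the "attachment-" prefix, and the counter are assumptions added for the example (the counter only guards against two files being saved in the same millisecond); they are not from the original code.

```java
import java.util.concurrent.atomic.AtomicLong;

class GeneratedFileName {
    // Hypothetical counter: keeps names distinct when two files arrive in the same millisecond.
    private static final AtomicLong COUNTER = new AtomicLong();

    /** Builds a name that contains only ASCII letters, digits, '-' and '.'. */
    static String newPdfName() {
        return "attachment-" + System.currentTimeMillis()
                + "-" + COUNTER.getAndIncrement() + ".pdf";
    }

    public static void main(String[] args) {
        System.out.println(newPdfName());
    }
}
```

If saving under a name like this still fails, the filename encoding can be ruled out as the cause.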

EDIT 2

You ask about the statement

if (fileName.startsWith("=?iso-8859"))

This seems to be trying to unpick a filename in MIME encoded-word format; see RFC 2047, Section 2.

First, I think that code may be unnecessary. The javadoc is not specific, but I think that the Part.getFileName() method should deal with decoding of the filename.

Second, if the decoding is necessary, then you are going about it the wrong way. The stuff after the charset cannot simply be treated as the value of the filename. Look at the RFC.

Third, if you do need to decode, you should use the relevant MimeUtility methods to decode "word" tokens such as the filename.

Fourth, ISO-8859-1 is NOT a suitable encoding for characters in non-Latin character sets.

Finally, examine the raw email headers of the emails that you are trying to decode and look for the header line that starts with

Content-Disposition: attachment; filename=...

If the filename looks like "=?iso-8859-1?...", and the filename is supposed to contain Japanese, Chinese, or similar characters, then the problem is in the client (or whatever) that constructed the email. The character set needs to be "utf-8" or one of the other multibyte character sets.
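To show why the text after the charset cannot be used as the filename directly, here is a standard-library-only sketch of what decoding one RFC 2047 encoded-word involves. It is an illustration, not a replacement for JavaMail: in a real mail program, MimeUtility.decodeText(fileName) would be the call to use. The class and method names are made up for the example, and the sketch handles only a single well-formed encoded-word.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EncodedWordSketch {
    // Layout from RFC 2047 Section 2: =?charset?encoding?encoded-text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([BbQq])\\?([^?]*)\\?=");

    /** Decodes a single encoded-word; returns the input unchanged if it is not one. */
    static String decode(String word) {
        Matcher m = ENCODED_WORD.matcher(word);
        if (!m.matches()) {
            return word;
        }
        Charset charset = Charset.forName(m.group(1));
        String text = m.group(3);
        byte[] bytes = m.group(2).equalsIgnoreCase("B")
                ? Base64.getDecoder().decode(text)   // "B" means Base64
                : decodeQ(text);                     // "Q" is similar to quoted-printable
        return new String(bytes, charset);
    }

    /** "Q" encoding: '_' is a space, "=XX" is one hex-encoded byte, the rest is literal. */
    private static byte[] decodeQ(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=' && i + 2 < text.length()) {
                out.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(decode("=?UTF-8?B?5pel5pys6KqeLnBkZg==?="));
        System.out.println(decode("=?ISO-8859-1?Q?r=E9sum=E9_1.pdf?="));
    }
}
```

The first sample decodes to 日本語.pdf and the second to "résumé 1.pdf". Whether the console prints them correctly depends on the terminal's own character set, which is a separate issue from the decoding.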

Java uses Unicode natively - you don't need to replace special characters, as Unicode has no special characters - every code point is treated equally. Your replaceSpChars() may be the culprit here.
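A small sketch of where the '?' characters can come from: a Java String holds the Japanese characters without trouble, and they are only lost when the string is pushed through a charset that cannot represent them. The class name and sample filename are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    /** Encodes the name to bytes in the given charset and decodes it back. */
    static String roundTrip(String name, Charset charset) {
        return new String(name.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Three Japanese characters followed by ".pdf", written as escapes
        // so the source file itself stays pure ASCII.
        String name = "\u65e5\u672c\u8a9e.pdf";

        // UTF-8 can represent every character, so the name survives intact.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name)); // true

        // ISO-8859-1 cannot, so each unmappable character becomes '?'.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1)); // ???.pdf
    }
}
```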
