When uploading Arabic files in Spring, filename ends up with XML entities instead of Arabic glyphs
I am using Spring upload to upload files. When uploading an Arabic file and getting the original file name in the controller, I get something like:
&#1575;&#1604;&#1605;&#1594;&#1601;&#1604;&#1610;&#1606;.png
I expect it to be:
المغفلين.png
Any ideas why this problem occurs?
It's likely Spring that has transformed the Unicode characters (at least, the non-ISO-8859-1 ones) into XML entities. This behaviour is presumably configurable somewhere in the Spring settings (or in those of the web MVC framework you're using with Spring but didn't mention). Since I don't use Spring, I can't go into detail about configuring this.
But if you can't figure it out, you could use Apache Commons Lang's StringEscapeUtils#unescapeXml() to manually unescape the XML entities into real Arabic glyphs:
String realFilename = StringEscapeUtils.unescapeXml(escapedFilename);
There is nothing wrong with that encoding. It means exactly the same as the name you gave it.
According to the XML standard, character references take the form &#n; where n is a decimal ([0-9]+) or hexadecimal (x[0-9a-fA-F]+) number referring to the Unicode code point of the character represented. Thus the file name in your question is valid XML.
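To make the two forms concrete, here is a minimal sketch of decoding numeric character references by hand with nothing but the JDK. The class name NumericCharRefDecoder and its decode method are hypothetical names made up for this illustration, not part of Spring or Commons Lang, and the sketch deliberately ignores named entities such as &amp;amp;.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Spring or Commons Lang API.
final class NumericCharRefDecoder {

    // Group 1 captures hexadecimal digits (&#x627;), group 2 decimal digits (&#1575;).
    private static final Pattern CHAR_REF =
            Pattern.compile("&#(?:x([0-9a-fA-F]+)|([0-9]+));");

    static String decode(String input) {
        Matcher m = CHAR_REF.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the literal text between the previous match and this one.
            out.append(input, last, m.start());
            try {
                String hex = m.group(1);
                int codePoint = (hex != null)
                        ? Integer.parseInt(hex, 16)
                        : Integer.parseInt(m.group(2), 10);
                if (Character.isValidCodePoint(codePoint)) {
                    out.appendCodePoint(codePoint);
                } else {
                    out.append(m.group()); // not a Unicode code point: keep as-is
                }
            } catch (NumberFormatException e) {
                out.append(m.group()); // number too large to parse: keep as-is
            }
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = "&#1575;&#1604;&#1605;&#1594;&#1601;&#1604;&#1610;&#1606;.png";
        System.out.println(decode(escaped));
    }
}
```

References that do not parse to a valid code point are left untouched rather than dropped, so a malformed file name is at least preserved.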
In your case the first character &#1575; (equivalent to &#x627;) represents the Unicode character with decimal code point 1575, usually written in hexadecimal as U+0627. This code point is described as the Arabic letter "alef".
The characters are encoded in logical order, left to right, even though they are Arabic (right-to-left) characters, so the "alef" appears to the left of the ASCII part of the file name. It is up to the rendering engine (whatever that might be) to render the string as RTL.
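As a small check of the points above, the following sketch inspects the first stored character of the decoded name. The class LogicalOrderDemo and its two helper methods are hypothetical names used only for this illustration; the file name is written with \u escapes so the source does not depend on the file's encoding.

```java
// Hypothetical demo class for illustration only.
final class LogicalOrderDemo {

    // Code point of the first character in storage (logical) order.
    static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    // Formats a code point in the usual U+XXXX notation.
    static String formatCodePoint(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        // The Arabic file name from the question, followed by ".png".
        String name = "\u0627\u0644\u0645\u063A\u0641\u0644\u064A\u0646.png";

        int first = firstCodePoint(name);
        System.out.println(first);                    // 1575
        System.out.println(formatCodePoint(first));   // U+0627
        System.out.println(Character.getName(first)); // ARABIC LETTER ALEF
    }
}
```

The alef is at index 0 of the string no matter how a right-to-left aware renderer later lays the name out on screen.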
My Java experience is very limited, so unfortunately I cannot point you at a built-in or Spring feature that will help you handle this, but if I had to guess, it seems to me that your XML is not being decoded properly.