Arabic file names converted into question marks
I have a Spring application. In one of its forms the user is supposed to upload an Excel file.
The application saves the file on the hard disk and provides a link so the user can download it again.
If the file name is written in English everything works, but if the file name contains Arabic characters, those characters are converted into question marks.
It is clear that the problem is related to character encoding, but I cannot work out exactly where it occurs.
Here is the system structure and the configurations:
- Operating system: CentOS
- Application server: Tomcat
Connector configuration in server.xml:
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" URIEncoding="UTF-8" />
Go through these two pages:
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) - by Joel Spolsky
and
the Wikipedia page for the Arabic Unicode block
Maybe these will help...
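To connect the Unicode block mentioned above to the problem, here is a minimal sketch that checks whether a file name still contains characters from the Arabic block (U+0600 to U+06FF) at a given point in the application. The class and method names are made up for illustration, and the sample name is written with Unicode escapes so the result does not depend on the encoding of the source file.

```java
public class ArabicCheck {

    // Returns true if at least one code point of the name lies in the
    // basic Arabic Unicode block (U+0600..U+06FF).
    static boolean containsArabic(String name) {
        return name.codePoints()
                   .anyMatch(cp -> Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.ARABIC);
    }

    public static void main(String[] args) {
        // Hypothetical upload name: three Arabic letters plus an extension.
        String uploaded = "\u0645\u0644\u0641.xlsx";
        System.out.println(containsArabic(uploaded));      // true
        System.out.println(containsArabic("report.xlsx")); // false
    }
}
```

Logging this check right after the upload is parsed and again just before the file is written should show at which step the Arabic characters are lost.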
You have to know which charset the Arabic characters are encoded in.
If you don't know, you can try UTF-16.
The code to use is the following:
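import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;

// output stream that collects the file's bytes
ByteArrayOutputStream bout = new ByteArrayOutputStream();
// input stream
InputStream in = new FileInputStream("filePath");
// reading buffer
byte[] buffer = new byte[1024];
// 1st read
int bytes = in.read(buffer, 0, buffer.length);
while (bytes != -1) {
    // write only the bytes that were actually read
    bout.write(buffer, 0, bytes);
    // re-load buffer, always starting at offset 0
    bytes = in.read(buffer, 0, buffer.length);
}
// decode the collected bytes with the charset the text was written in
String yourText = new String(bout.toByteArray(), Charset.forName("YOUR_CHARSET"));
// close streams or use the Java SE 7 try-with-resources statement
in.close();
bout.close();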
Enjoy yourself.
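For what it's worth, the question marks themselves are a typical sign that the text was encoded with a charset that has no Arabic letters. Here is a small sketch of that effect, with made-up class and method names; it only shows the symptom and is not a statement about where it happens in the application above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {

    // Encodes the text to bytes and decodes it again with the same charset.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Three Arabic letters, written as escapes to avoid source-encoding issues.
        String name = "\u0645\u0644\u0641.xlsx";

        // UTF-8 can represent Arabic, so the name survives unchanged.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name)); // true

        // US-ASCII cannot, so each Arabic letter becomes '?'.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII)); // ???.xlsx
    }
}
```

Once the name has gone through such a step, the original letters are gone for good, so the fix has to happen before that step, not after it.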
In the Windows Control Panel, go to Regional Options and, on the Administrative tab, under the language for non-Unicode programs, select Arabic as the regional language.
I think your system's language does not support Arabic, so try this:
byte[] utf8Bytes = ("Arabic String").getBytes("arabic");
Object[] argument = new Object[]{new String(utf8Bytes, "UTF8")};
System.out.println(argument[0]);
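Before converting strings by hand, it may help to check what the JVM itself uses by default. This is a rough diagnostic sketch with made-up class and method names. Note that `sun.jnu.encoding` is a JDK-implementation-specific system property and may be missing on some JVMs, and whether the defaults are really the cause here is an assumption, not something the question confirms.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Returns true if every character of the name can be encoded in the charset.
    static boolean canEncode(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // Hypothetical Arabic file name, written with Unicode escapes.
        String name = "\u0645\u0644\u0641.xlsx";

        // Defaults the JVM picked up from the environment it was started in.
        System.out.println("default charset  = " + Charset.defaultCharset());
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        // Implementation-specific property; may print null on some JVMs.
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));

        // If this prints false, the default charset cannot represent the name.
        System.out.println("default can encode name: " + canEncode(name, Charset.defaultCharset()));
        System.out.println("UTF-8 can encode name:   " + canEncode(name, StandardCharsets.UTF_8));
    }
}
```

Running this under the same user and environment as Tomcat gives a more useful answer than running it from an interactive shell, since the two can have different locale settings.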