开发者

Issue encoding java->xls

This is not a pure java question and can also be related to HTML

I've written a java servlet that queries a database table and shows the result as a html table. The user can also ask to receive the result as an Excel sheet. Im creating the Excel sheet by printing the same html table, but with the content-type of "application/vnd.ms-excel". The Excel file is created fine. The problem is that the tables may contain non-english data so I want to use a UTF-8 encoding.

PrintWriter out = response.getWriter();
response.setContentType("application/vnd.ms-excel:ISO-8859-1");
//response.setContentType("application/vnd.ms-excel:UTF-8");
response.setHeader("cache-control", "no-cache");
response.setHeader("Content-Disposition", "attachment; filename=file.xls");
out.print(src);
out.flush();

The non-english characters appear as garbage (áéíóú)

Also I tried开发者_开发知识库 converting to bytes from String

byte[] arrByte = src.getBytes("ISO-8859-1");
String result = new String(arrByte, "UTF-8");

But I Still getting garbage, What can I do?. Thanks

UPDATE: if I open the excel file in notepad + + the type of file encoding is "UTF-8 without BOM", if I change the encoding to "UTF-8" and then open the file in Excel, the characters "áéíóú" look good.


Excel is a binary format, not a text format, so you should not need to set any encoding, since it simply doesn't apply. Whatever system you are using to build the excel file (e.g. Apache Poi) will take care of the encoding of text within the excel file.

You should not try to convert the recieved bytes to a string, just store them in a byte array or write them out to a file.

EDIT: from the comment, it doesn't sound as if you are using a "real" binary excel file, but a tab delimited text file (CSV). In that case, make sure you use consistent encoding, e.g UTF-8 throughout.

Also, before calling response.getWriter(), call setContentType first.

See HttpServletResponse.getPrintWriter()

EDIT: You can try writing the BOM. It's normally not required, but file format handling in Office is far from normal...

Java doesn't really have support for the BOM. You'll have to fake it. It means that you need to use the response outputStream rather than writer, since you need to write raw bytes (the BOM). So you change your code to this:

response.setContentType("application/vnd.ms-excel:UTF-8");
// set other headers also, "cache-control" etc..
OutputStream outputStream = response.getOutputStream();
outputStream.write(0xEF);   // 1st byte of BOM
outputStream.write(0xBB);
outputStream.write(0xBF);   // last byte of BOM
// now get a PrintWriter to stream the chars.
PrintWriter out = new PrintWriter(new OutputStreamWriter(outputStream,"UTF-8"));
out.print(src);


Do you get "garbage" when you print result to standard output?

Edit (code in code tags from the comment below): response.setContentType("application/vnd.ms-excel; charset=UTF-8")


Try using the ServletResponse.setCharacterEncoding(java.lang.String charset) method.

response.setCharacterEncoding("UTF-8");


I had the same issue.. i fixed it with using print() instead of write()

outputStream.print('\ufeff');
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜