Reading special characters (æ ø å) with Java from an Oracle database

I have a problem reading special characters from an Oracle database (using the JDBC driver and GlassFish TopLink).

I store the name "GRØNLÅEN KJÆTIL" in the database through a web service, and in the database the data is stored correctly.

But when I read this String, print it to the log file and convert it to a byte array with this code:

 int pos = 0;
 byte[] msg=new byte[1024];

 String F = "F" + passenger.getName();
 logger.debug("Add " + F + " " + F.length());
 msg = addStringToArrayBytePlusSeparator(msg, F,pos);

..............

private byte[] addStringToArrayBytePlusSeparator(byte[] arrDest,String strToAdd,int destPosition)
    {
        System.arraycopy(strToAdd.getBytes(Charset.forName("ISO-8859-1")), 0, arrDest, destPosition, strToAdd.getBytes().length);

        arrDest = addSeparator(arrDest,destPosition+strToAdd.getBytes().length,1);

        return arrDest;
    }

1) The log file contains: "Add FGRÃNLÃ " (the name is wrong and F.length() is not printed).

2) The code throws: java.lang.ArrayIndexOutOfBoundsException at java.lang.System.arraycopy(Native Method) at it.edea.ebooking.business.chi.control.VingCardImpl.addStringToArrayBytePlusSeparator(Test.java:225).

Any solution?

Thanks


You're calling strToAdd.getBytes() without specifying the character encoding, within the System.arraycopy call - that will be using the system default encoding, which may well not be ISO-8859-1. You should be consistent in which encoding you use. Frankly I'd also suggest that you use UTF-8 rather than ISO-8859-1 if you have the choice, but that's a different matter.
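To make the mismatch concrete, here is a minimal sketch that compares the encoded length of the name from the question in two encodings. It assumes Java 7 or later (for StandardCharsets) and assumes, purely for illustration, that the platform default encoding is UTF-8; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class EncodingLengthDemo {
    // Number of bytes the string occupies when encoded as ISO-8859-1.
    public static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Number of bytes the string occupies when encoded as UTF-8
    // (assumed here to stand in for the platform default encoding).
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "GRØNLÅEN KJÆTIL", written with Unicode escapes so the source
        // file encoding does not matter.
        String f = "F" + "GR\u00D8NL\u00C5EN KJ\u00C6TIL";
        System.out.println(latin1Length(f)); // 16: one byte per character
        System.out.println(utf8Length(f));   // 19: Ø, Å and Æ take two bytes each
    }
}
```

With these numbers, the arraycopy call in the question would ask for 19 bytes from a 16-byte source array, which is one way to end up with the reported exception.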

Why are you dealing with byte arrays anyway at this point? Why not just use strings?

Also note that your addStringToArrayBytePlusSeparator method doesn't give any indication of how many bytes it's copied, which means the caller won't have any idea what to do with it afterwards. If you must use byte arrays like this, I'd suggest making addStringToArrayBytePlusSeparator return either the new "end of logical array" or the number of bytes copied. For example:

private static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");

/**
 * (Insert fuller description here.)
 * Returns the number of bytes written to the array
 */
private static int addStringToArrayBytePlusSeparator(byte[] arrDest,
                                              String strToAdd,
                                              int destPosition)
{
    byte[] encodedText = strToAdd.getBytes(ISO_8859_1);
    // TODO: Verify that there's enough space in the array 

    System.arraycopy(encodedText, 0, arrDest, destPosition, encodedText.length);

    return encodedText.length;
}
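As a rough sketch of how a caller might use the returned count, and of one way to fill in the space check left as a TODO, something like the following could work. The class name, the method name addString and the choice of IllegalArgumentException are assumptions for illustration, and the separator handling from the question is left out.

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayWriterDemo {
    /**
     * Encodes s as ISO-8859-1 and copies it into dest starting at pos.
     * Returns the number of bytes written.
     */
    public static int addString(byte[] dest, String s, int pos) {
        byte[] encoded = s.getBytes(StandardCharsets.ISO_8859_1);
        // Refuse to write if the encoded text does not fit from pos onwards.
        if (pos < 0 || encoded.length > dest.length - pos) {
            throw new IllegalArgumentException(
                "Not enough space: need " + encoded.length + " bytes at position " + pos);
        }
        System.arraycopy(encoded, 0, dest, pos, encoded.length);
        return encoded.length;
    }

    public static void main(String[] args) {
        byte[] msg = new byte[1024];
        int pos = 0;
        // The caller advances its own position by the returned byte count.
        pos += addString(msg, "F" + "GR\u00D8NL\u00C5EN KJ\u00C6TIL", pos);
        pos += addString(msg, "X", pos);
        System.out.println(pos); // 17
    }
}
```

Returning the count keeps the method free of hidden state: the caller always knows where the next field starts.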


Encoding/decoding problems are hard. At every processing step you have to encode or decode correctly. So:

  1. Familiarize yourself with the difference between bytes (InputStream) and characters (Reader, String).
  2. Choose the character encoding in which you want to store your data in the database, and the character encoding in which you want to expose your web service. Make sure that any initial data you load into the database is in the right encoding.
  3. Connect with the right database properties. MySQL requires an addition to the connection URL when using UTF-8: ?useUnicode=true&characterEncoding=UTF-8. I don't know about Oracle.
  4. If you print or debug at a certain step and the output looks OK, you still can't be sure you did it right. The logger can write with the wrong encoding (sometimes making something look OK when it is in fact broken). Your terminal might not handle unusual byte sequences correctly. The same holds for command-line database clients: your data might be stored wrongly while a misconfigured terminal shows it as correct.
  5. In XML, it's not only the stream encoding that matters, but also the encoding attribute in the XML declaration.
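The points above about mismatched steps can be illustrated with a small sketch: text encoded as UTF-8 at one step and decoded as ISO-8859-1 at the next. This is only a guess at the kind of mismatch involved, not a diagnosis of the setup in the question; the class and method names are hypothetical, and Java 7 or later is assumed.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Simulates a faulty step: bytes written as UTF-8 are read back as ISO-8859-1.
    public static String garble(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses that specific mistake. This works because ISO-8859-1 maps
    // every byte value to exactly one character, so no information is lost.
    public static String repair(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "GR\u00D8NL\u00C5EN KJ\u00C6TIL"; // "GRØNLÅEN KJÆTIL"
        String broken = garble(name);
        // Each of Ø, Å, Æ becomes 'Ã' followed by an invisible control character.
        System.out.println(broken.length());             // 18 instead of 15
        System.out.println(repair(broken).equals(name)); // true
    }
}
```

The garbled string begins with "GRÃ" followed by a control character, which resembles the log output in the question. Such control characters can also upset loggers and terminals, so what is printed may not reflect what is actually in the String.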