Java String conversion to hex

2022-12-17 11:40

I am working on a TCP/IP socket listener which listens on port 80 for data arriving from remote hosts. The incoming data is in an unreadable format, so I first save it as-is in a String, then convert that String to a character array, and then convert each element of the array to hex. Most of the data converts to hex correctly, but in some places the result is 'fffd': for example, where the hex should be 'bc' (0xBC), I get 'fffd' (0xFF 0xFD). This leads me to believe that some parts of the incoming data are not being read properly by my Java program. I'm using BufferedInputStream and InputStreamReader to read the incoming data, and I check for the end of the stream in the following way:

  BufferedInputStream is = new BufferedInputStream(connection.getInputStream());
  InputStreamReader isr = new InputStreamReader(is);
  while (isr.read() != -1)
  {
    ...
  }

where 'connection' is a socket object.

The input data that I'm getting through the socket is: #SR,IN-0002005,10:49:37,16/01/2010, $<49X ™™š@(bN>™™šBB ©: 4ä ýÕ 01300>ÀäCåKöA÷Ð›.

The hex conversion that my program does has 'fffd' in many places where other hex values should be. The conversion is correct, though, for around 60% of the input string.

Any pointers on why my resulting hex conversion is not what it should be would be of great help.

I don't think you should be using a Reader. Readers are for reading characters, but you seem to be working with binary data. Use the InputStream directly and transform the bytes as you receive them. chars in Java are Unicode characters, which I am guessing is the source of your issues.
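A minimal sketch of what reading the bytes directly could look like. The class name `HexDump`, the method `toHex`, and the sample bytes are made up for illustration; in the real listener the stream would come from `connection.getInputStream()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class HexDump {
    // Reads raw bytes until end of stream and returns them as lowercase hex,
    // two digits per byte. No Reader is involved, so no charset decoding happens.
    static String toHex(InputStream in) throws IOException {
        StringBuilder hex = new StringBuilder();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                // Mask with 0xff so bytes 0x80..0xff are not sign-extended.
                hex.append(String.format("%02x", buffer[i] & 0xff));
            }
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for connection.getInputStream(); the sample bytes are invented.
        byte[] sample = { '#', 'S', 'R', (byte) 0xBC, (byte) 0x99 };
        System.out.println(toHex(new ByteArrayInputStream(sample)));
    }
}
```

Because the byte value is read and formatted without ever becoming a char, a byte such as 0xBC comes out as "bc" regardless of the platform's default encoding.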

Java Strings are not as easy to "abuse" for carrying transparent binary data as strings are in VB (or most other languages). VB treats strings internally as an array of bytes, while in Java a String is an ordered sequence of characters.

In your case, you wrap your InputStream in an InputStreamReader, which causes your platform's default character encoding to be used when converting the bytes delivered by the InputStream into the characters delivered by the InputStreamReader. Not every encoding assigns a character to every byte value: the ISO 8859-X standards define no graphic characters in the ranges 0x00 to 0x1f and 0x7f to 0x9f, and some of them leave further positions unassigned. If the encoding in use has no mapping for a byte that is read, the InputStreamReader returns the "replacement character" with code point 0xfffd to indicate an unknown character.
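The effect can be reproduced with a single byte. The sketch below decodes 0xBC through an InputStreamReader with two explicitly chosen charsets. UTF-8 is used only as one example of an encoding in which a lone 0xBC is not valid; which default encoding the asker's machine actually uses is not known. The class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReplacementDemo {
    // Decodes one byte through an InputStreamReader using the given charset
    // and returns the first char read (or -1 at end of stream).
    static int decodeOne(int b, Charset cs) throws IOException {
        byte[] data = { (byte) b };
        try (InputStreamReader r = new InputStreamReader(new ByteArrayInputStream(data), cs)) {
            return r.read();
        }
    }

    public static void main(String[] args) throws IOException {
        // A lone 0xBC is a UTF-8 continuation byte, so it cannot be decoded.
        System.out.println(Integer.toHexString(decodeOne(0xBC, StandardCharsets.UTF_8)));
        // ISO-8859-1 maps every byte to the code point with the same value.
        System.out.println(Integer.toHexString(decodeOne(0xBC, StandardCharsets.ISO_8859_1)));
    }
}
```

Once the byte has been replaced by U+FFFD the original value is gone, so converting the chars to hex afterwards cannot recover 'bc'.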

The only "correct" way is to leave out the InputStreamReader and use byte arrays for the binary data.

When converting bytes to chars with an InputStreamReader, the encoding makes a huge difference:

  import java.io.ByteArrayInputStream;
  import java.io.IOException;
  import java.io.InputStreamReader;

  public class EncodingCheck {
    public static void main(String[] args) throws Exception {
      checkEncoding("ISO-8859-1");
      checkEncoding("ISO-8859-9");
      checkEncoding("Windows-1252");
      checkEncoding("UTF-8");
      checkEncoding("UTF-16BE");
      checkEncoding("Big5");
      checkEncoding("Shift-JIS");
    }

    // Feeds all 256 byte values through an InputStreamReader using the given
    // encoding and prints every position whose decoded char differs from it.
    private static void checkEncoding(String encoding) throws IOException {
      byte[] all = new byte[256];
      for (int i = 0; i < all.length; ++i) all[i] = (byte) i;
      ByteArrayInputStream bais = new ByteArrayInputStream(all);
      InputStreamReader isr = new InputStreamReader(bais, encoding);
      char[] ca = new char[256];
      int read = isr.read(ca);  // number of chars actually decoded
      System.out.println(encoding + ":" + read);
      for (int i = 0; i < read; ++i) {
        if (ca[i] != i) {
          System.out.println(Integer.toHexString(i) + "->" +
              Integer.toHexString(ca[i]));
        }
      }
    }
  }

The only one that works "as expected" is ISO-8859-1, whose 256 characters are defined to be the first 256 code points of Unicode. ISO-8859-9 and Windows-1252 also produce one char per byte; ISO-8859-9 maps a few bytes to different characters, while Windows-1252 yields 0xFFFD for several bytes.
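If the data really must pass through a String at some point, the one-to-one property of ISO-8859-1 means a round trip through that charset does not lose bytes. This is a sketch of that observation, not a recommendation over plain byte arrays; the class and method names are made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTrip {
    // Returns true if decoding the bytes to a String and encoding them back
    // with the same charset reproduces the original bytes exactly.
    static boolean roundTrips(byte[] data, Charset cs) {
        String s = new String(data, cs);
        return Arrays.equals(data, s.getBytes(cs));
    }

    public static void main(String[] args) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) all[i] = (byte) i;
        System.out.println("ISO-8859-1: " + roundTrips(all, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8: " + roundTrips(all, StandardCharsets.UTF_8));
    }
}
```

With ISO-8859-1 every byte survives; with UTF-8 the bytes that were replaced by U+FFFD come back as different bytes, so the check fails.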

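With UTF-8, every byte above 0x7F comes out wrong, because those values are only valid as parts of multi-byte sequences, and a run of consecutive byte values does not form valid sequences. UTF-16 combines two bytes into each char, so you get only about half as many chars as bytes, and the other multi-byte encodings are a mess.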
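Both observations can be checked on small inputs. The sketch below decodes each byte on its own as UTF-8 and counts how many turn into the replacement character, then decodes a two-byte UTF-16BE sequence. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class MultiByteDemo {
    // Counts the byte values in [from, to] that, decoded alone as UTF-8,
    // become exactly one replacement character U+FFFD.
    static int replacedInUtf8(int from, int to) {
        int count = 0;
        for (int b = from; b <= to; b++) {
            String s = new String(new byte[] { (byte) b }, StandardCharsets.UTF_8);
            if (s.equals("\uFFFD")) count++;
        }
        return count;
    }

    // Decodes two bytes as one big-endian UTF-16 code unit.
    static String utf16be(int hi, int lo) {
        return new String(new byte[] { (byte) hi, (byte) lo }, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println("0x00..0x7f replaced: " + replacedInUtf8(0x00, 0x7F));
        System.out.println("0x80..0xff replaced: " + replacedInUtf8(0x80, 0xFF));
        // Two bytes in, one char out.
        System.out.println(utf16be(0x00, 0x41) + " length " + utf16be(0x00, 0x41).length());
    }
}
```

None of the ASCII range is replaced, while all 128 values from 0x80 to 0xFF are, since a single such byte is never a complete UTF-8 sequence.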

For development purposes look at the one in Eclipse already for use with those web containers with server connectors.
