Decoding and encoding between String and byte[] in Java
byte[] bytes = new byte[] { 1, -1 };
System.out.println(Arrays.toString(new String(bytes, "UTF-8").getBytes("UTF-8")));
System.out.println(Arrays.toString(new String(bytes, "ISO-8859-1").getBytes("ISO-8859-1")));
output:
[1, -17, -65, -67]
[1, -1]
Why does the UTF-8 round trip change the bytes while the ISO-8859-1 round trip does not?
Your byte array isn't a valid UTF-8-encoded string... so the string you get from
new String(bytes, "UTF-8")
contains U+0001 (for the first byte) and U+FFFD, the replacement character, in place of the invalid second byte. When that string is encoded back to UTF-8, U+FFFD becomes three bytes, which is the pattern shown.
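To see this for yourself, here is a minimal sketch that prints the code units of the decoded string and the re-encoded bytes. The class name Utf8RoundTrip and the helper isValidUtf8 are made up for illustration; the helper assumes you would rather detect bad input than have it silently replaced, and uses a CharsetDecoder configured to report errors.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTrip {
    // Decode as UTF-8 (malformed input is replaced with U+FFFD), then encode again.
    static byte[] roundTrip(byte[] bytes) {
        String s = new String(bytes, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: true only if the bytes are well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = { 1, -1 };

        // Show each UTF-16 code unit of the decoded string.
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        decoded.chars().forEach(c -> System.out.printf("U+%04X%n", c)); // U+0001, U+FFFD

        System.out.println(Arrays.toString(roundTrip(bytes))); // [1, -17, -65, -67]
        System.out.println(isValidUtf8(bytes));                // false
    }
}
```

Checking with a strict decoder first is one way to fail fast on input that was never text to begin with.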
Basically you shouldn't try to interpret arbitrary binary data as if it were encoded in a particular encoding. If you want to represent arbitrary binary data as a string, use something like base64.
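As a rough illustration of the base64 suggestion, the sketch below uses java.util.Base64 (available since Java 8) to turn the same two bytes into text and back. The class and method names (Base64RoundTrip, toText, fromText) are invented for this example.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64RoundTrip {
    // Represent arbitrary bytes as an ASCII-only string.
    static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Recover the original bytes from the base64 text.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] bytes = { 1, -1 };
        String text = toText(bytes);
        System.out.println(text);                             // Af8=
        System.out.println(Arrays.toString(fromText(text)));  // [1, -1]
    }
}
```

Unlike decoding with a charset, this round trip is lossless for every possible byte value, at the cost of roughly a third more characters.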
-1 (0xFF) is not a valid byte anywhere in a UTF-8 sequence. [-17, -65, -67] is EF BF BD in hex, which is the UTF-8 encoding of the replacement character U+FFFD that the decoder substitutes for it.
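A String isn't a container for binary data. It is a sequence of char values (UTF-16 code units). The byte -1 (0xFF) does not decode to any character in UTF-8, so the round trip cannot preserve it. It survives ISO-8859-1 only because that charset maps each of the 256 byte values to a character. There's no reason to expect this to work in general. Ergo, don't do it.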