get char value in java

2023-01-28 22:54

How can I get the UTF-8 code of a char in Java? I have the char 'a' and I want the value 97. I have the char 'é' and I want the value 233.

Here is a table for more values.

I tried Character.getNumericValue(a), but for a it gives me 10 and not 97. Any idea why?

This seems very basic but any help would be appreciated!

char is actually a numeric type containing the Unicode value of the character (a 16-bit UTF-16 code unit, to be exact; you need two chars to represent characters outside the BMP). You can use it in arithmetic and comparisons much as you would an int.

Character.getNumericValue() tries to interpret the character as a digit.
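That also explains the 10 in the question: getNumericValue treats the letters a through z as the digits 10 through 35, so its result has nothing to do with the character's code. A minimal sketch comparing the two; the class name NumericValueDemo and the helper names are only illustrative:

```java
class NumericValueDemo {
    // Digit-style interpretation of the character
    static int asDigit(char c) {
        return Character.getNumericValue(c);
    }

    // The character's UTF-16 code unit, via widening conversion to int
    static int asCode(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(asDigit('a')); // 10
        System.out.println(asCode('a'));  // 97
        System.out.println(asDigit('7')); // 7
        System.out.println(asCode('7'));  // 55
        System.out.println(asDigit('?')); // -1, no numeric value
    }
}
```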

You can use the codePointAt(int index) method of java.lang.String for that. Here's an example:

"a".codePointAt(0) --> 97
"é".codePointAt(0) --> 233

If you want to avoid creating strings unnecessarily, the following works as well and can be used for char arrays:

Character.codePointAt(new char[] {'a'},0)
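codePointAt makes a real difference for characters outside the BMP, which occupy two chars (a surrogate pair). A small sketch, using U+1F600 as an arbitrary example; the class name CodePointDemo and the helper firstCodePoint are illustrative:

```java
class CodePointDemo {
    // Returns the code point that starts at the first char of s
    static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        String bmp = "\u00e9";                  // e-acute, fits in one char
        String supplementary = "\uD83D\uDE00";  // U+1F600 written as its surrogate pair

        System.out.println(firstCodePoint(bmp));            // 233
        System.out.println(supplementary.length());         // 2 (two UTF-16 code units)
        System.out.println((int) supplementary.charAt(0));  // 55357, the high surrogate 0xD83D
        System.out.println(firstCodePoint(supplementary));  // 128512, i.e. 0x1F600
    }
}
```

Casting a single char to int only yields half of such a character, which is why codePointAt is the safer choice when the input is arbitrary text.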

Those "UTF-8" codes are no such thing. They're actually just Unicode values, as per the Unicode code charts.

So an 'é' is actually U+00E9 - in UTF-8 it would be represented by two bytes { 0xc3, 0xa9 }.
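For comparison, the two UTF-8 bytes mentioned above can be obtained from a String with getBytes. A brief sketch; the class name Utf8BytesDemo and the helper utf8Bytes are illustrative:

```java
import java.nio.charset.StandardCharsets;

class Utf8BytesDemo {
    // Encodes s as UTF-8 and returns each byte as an unsigned value (0..255)
    static int[] utf8Bytes(String s) {
        byte[] raw = s.getBytes(StandardCharsets.UTF_8);
        int[] unsigned = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            unsigned[i] = raw[i] & 0xFF; // Java bytes are signed; mask to get 0..255
        }
        return unsigned;
    }

    public static void main(String[] args) {
        for (int b : utf8Bytes("\u00e9")) {
            System.out.println(Integer.toHexString(b)); // c3, then a9
        }
        System.out.println(utf8Bytes("a").length); // 1, ASCII is a single byte in UTF-8
    }
}
```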

Now to get the Unicode value - or to be more precise the UTF-16 value, as that's what Java uses internally - you just need to convert the value to an integer:

char c = '\u00e9'; // c is now e-acute
int i = c; // i is now 233

This produces the right result:

int a = 'a';
System.out.println(a); // outputs 97

Likewise:

System.out.println((int)'é');

prints out 233.

Note that both forms perform the same widening conversion from char to int, so each works for any character that fits in a single char, not only ASCII. Multiplying the char by 1 promotes it to int as well: System.out.println(1 * 'é');

Your question is unclear. Do you want the Unicode codepoint for a particular character (which is the example you gave), or do you want to translate a Unicode codepoint into a UTF-8 byte sequence?

If the former, then I recommend the code charts at http://www.unicode.org/

If the latter, then the following program will do it:

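import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;

public class Foo
{
   public static void main(String[] argv)
   throws Exception
   {
      char c = '\u00E9';
      // Write the char through a UTF-8 encoder into an in-memory buffer
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      OutputStreamWriter out = new OutputStreamWriter(bos, "UTF-8");
      out.write(c);
      out.flush();
      byte[] bytes = bos.toByteArray();
      // Mask with 0xFF to print each byte as an unsigned value (195, then 169)
      for (int ii = 0 ; ii < bytes.length ; ii++)
         System.out.println(bytes[ii] & 0xFF);
   }
}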

(there's also an online Unicode to UTF8 page, but I don't have the URL on this machine)

My method to do it is something like this:

char c = 'c';
int i = Character.codePointAt(String.valueOf(c), 0);
// testing
System.out.println(String.format("%c -> %d", c, i)); // c -> 99
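The same idea extends to a whole string. A possible sketch using String.codePoints() (Java 8 and later); the class name StringCodePointsDemo and the helper describe are illustrative:

```java
import java.util.stream.Collectors;

class StringCodePointsDemo {
    // Builds "x -> n" pairs for every code point in s, separated by ", "
    static String describe(String s) {
        return s.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)) + " -> " + cp)
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute; how it displays depends on the console encoding
        System.out.println(describe("a\u00e9c")); // a -> 97, é -> 233, c -> 99
    }
}
```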

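You can write a simple loop that prints a range of code values next to their characters (here 12 through 999, which is only a small slice of Unicode and has nothing to do with UTF-8 specifically):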

public class UTF8Characters {
    public static void main(String[] args) {
        for (int i = 12; i <= 999; i++) {
            System.out.println(i +" - "+ (char)i);
        }
    }
}

There is an open source library, MgntUtils, that has a utility class StringUnicodeEncoderDecoder. That class provides static methods that convert any String into a Unicode sequence and vice versa. Very simple and useful. To convert a String you just do:

String codes = StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence(myString);

For example a String "Hello World" will be converted into

"\u0048\u0065\u006c\u006c\u006f\u0020\u0057\u006f\u0072\u006c\u0064"

It works with any language. Here is the link to the article that explains all the details about the library: MgntUtils. Look for the subtitle "String Unicode converter". The article gives you links to Maven Central, where you can get the artifacts, and to GitHub, where you can get the project itself. The library comes with well-written Javadoc and source code.
