Encoding cp1252
When I try the following in Java:
System.out.println(System.getProperty("file.encoding"));
I get cp1252 as the encoding.
Is there a way to know where this value is coming from? (Like Environment variables or something)
I would like to print the value of the encoding at the command prompt, using some command like systeminfo on Windows XP.
cp1252 is the default encoding on English installations of MS Windows (what Microsoft refers to as ANSI). By default, Java derives its default character encoding from the system locale, so the value is system dependent. In general I don't like to rely on default encodings. If I know my text will be pure ASCII I ignore the issue; otherwise I set the encoding explicitly when instantiating InputStreamReader, OutputStreamWriter, String, etc., or when calling getBytes.
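As a rough illustration of setting the encoding explicitly, here is a minimal sketch. The class name ExplicitEncoding and its helper methods write and read are made up for this example; the JDK calls themselves (the InputStreamReader and OutputStreamWriter constructors taking a Charset, String.getBytes(Charset), and new String(byte[], Charset)) are standard.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class ExplicitEncoding {

    // Encode text through an OutputStreamWriter with a named charset,
    // instead of relying on the platform default.
    static byte[] write(String text, Charset cs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, cs)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decode bytes through an InputStreamReader with the same charset.
    static String read(byte[] bytes, Charset cs) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(bytes), cs)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e

        byte[] utf8 = write(text, StandardCharsets.UTF_8);
        System.out.println(utf8.length);                                      // 5
        System.out.println(read(utf8, StandardCharsets.UTF_8).equals(text));  // true

        // The String-based equivalents, again with an explicit charset.
        byte[] direct = text.getBytes(StandardCharsets.UTF_8);
        String back = new String(direct, StandardCharsets.UTF_8);
        System.out.println(back.equals(text));                                // true
    }
}
```

The same pattern works with Charset.forName("windows-1252") when a file really is in the Windows ANSI code page; the point is only that the charset is named in the code rather than inherited from the machine.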
Note that cp1252 is not the default encoding on the Windows command prompt. That is the even older cp437, which you can see (and change) using the chcp command.
That value is, on Windows at least, the legacy codepage used for non-Unicode text. It's what the OS converts strings to and from when you use the old ANSI APIs. For any newer program it should have no effect (that being said, I still see enough programs that use the A and not the W variants of API functions, sadly).
For your Java program none of that should matter, as Java uses Unicode internally for its strings. If you want to write or read text files in the system's codepage, however, then you'll need it.
For the command prompt, however, that encoding is of no significant value, as the console by default uses the OEM encoding, which mimics the one from the DOS era (850 and 437 are pretty common).
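To make the ANSI versus OEM difference concrete, here is a small sketch of what happens when bytes written in one code page are read in the other. The class CodePageMismatch and its method reinterpret are hypothetical names for this example, and it assumes the JDK in use ships the windows-1252 and IBM437 charsets (IBM437 being the name Java uses for code page 437).

```java
import java.nio.charset.Charset;

// Hypothetical demo class, for illustration only.
public class CodePageMismatch {

    // Encode with one charset and decode with another, as happens when
    // bytes meant for the ANSI code page are displayed by an OEM console.
    static String reinterpret(String text, String encodeAs, String decodeAs) {
        byte[] bytes = text.getBytes(Charset.forName(encodeAs));
        return new String(bytes, Charset.forName(decodeAs));
    }

    public static void main(String[] args) {
        String original = "\u00e9"; // e with acute accent
        String shown = reinterpret(original, "windows-1252", "IBM437");

        // Print code points rather than the characters themselves, so the
        // result does not depend on the console's own encoding.
        // In windows-1252 the character is the byte 0xE9; in code page 437
        // that byte is the Greek capital theta (U+0398).
        System.out.printf("U+%04X -> U+%04X%n",
                (int) original.charAt(0), (int) shown.charAt(0));
    }
}
```

This is why accented characters printed by a program that assumes cp1252 can show up as unrelated symbols in a console set to 437 or 850.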
Since this doesn't really have anything to do with Java, you could just opt to use a WSH script:
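' Save this script as printANSI.vbs
' Usage: cscript /Nologo printANSI.vbs
' Reads the ANSI code page (ACP) of the running system from the registry.
' CurrentControlSet is used so that the active control set is queried.
Set objShell = CreateObject("WScript.Shell")
cp = objShell.RegRead("HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet" & _
    "\Control\Nls\CodePage\ACP")
WScript.Echo cp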
See also the chcp command; you may want to read up on how encoding works on the Windows command prompt (some links in this blog post).
As far as I have discovered, when you run the program from Eclipse this value follows the text file encoding of your Java source file, so the output changes once you change that setting. In Eclipse, open the file's Resource properties (Alt+Enter, or right-click the file and go to Resource) and change the text file encoding from cp1252 to something else, say UTF-8. Your output won't be cp1252 any longer.
I believe this encoding is set by the JVM, so it wouldn't make sense to retrieve it from outside.
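Since the value is determined by the JVM at startup, the simplest place to inspect it is from inside the JVM. The sketch below is one way to do that; DefaultEncodingReport and report are hypothetical names, and sun.jnu.encoding is an implementation-specific property (seen on OpenJDK-based JVMs) that may be missing elsewhere, in which case it prints as null.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic class, for illustration only.
public class DefaultEncodingReport {

    // Collect what the running JVM reports about its default encoding.
    static Map<String, String> report() {
        Map<String, String> info = new LinkedHashMap<>();
        // The charset the standard APIs fall back to when none is given.
        info.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        // The system property from the question.
        info.put("file.encoding", System.getProperty("file.encoding"));
        // Implementation-specific; may be null on some JVMs (assumption).
        info.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        return info;
    }

    public static void main(String[] args) {
        report().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

On many JVMs, starting the program with java -Dfile.encoding=UTF-8 DefaultEncodingReport changes the reported file.encoding value, which is consistent with the Eclipse behaviour described above; treat that as an observation about common JVMs rather than a guarantee.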