Shift JIS decoding/encoding in Perl
When I decode a Shift JIS encoded string and encode it back, some of the characters get garbled. I have the following code:
use Encode qw(decode encode);
my $val = <STDIN>;  # read the test string from standard input
print "\nbefore decoding: $val";
my $ustr = Encode::decode("shiftjis", $val);
print "\nafter decoding: $ustr";
print "\nbefore encoding: $ustr";
$val = Encode::encode("shiftjis", $ustr);
print "\nafter encoding: $val";
When I use the string helloソworld as input, it is decoded and encoded back properly, i.e. the "before decoding" and "after encoding" prints in the code above show the same value. But when I try another string, such as ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩ, the final output is garbled.
Is this a problem specific to the Perl library, or is it a general Shift JIS mapping problem? Is there any solution for it?
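You should simply replace shiftjis with cp932. The Roman numerals Ⅰ to Ⅹ are a vendor extension (NEC row 13) that Microsoft's code page 932 includes but Perl's strict shiftjis mapping does not.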
http://en.wikipedia.org/wiki/Code_page_932
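To see the difference between the two encoding names, here is a minimal sketch that round-trips both test strings from the question through shiftjis and cp932. The helper name round_trips is hypothetical, not part of Encode, and the script assumes it is saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;                        # string literals below are UTF-8 in the source
use Encode qw(decode encode);

# Hypothetical helper: encode text to bytes and decode it again,
# returning true only if the original characters come back unchanged.
sub round_trips {
    my ($encoding, $text) = @_;
    my $bytes = encode($encoding, $text);   # characters -> bytes (unmappable chars are substituted)
    my $chars = decode($encoding, $bytes);  # bytes -> characters
    return $chars eq $text;
}

for my $encoding (qw(shiftjis cp932)) {
    for my $sample ('helloソworld', 'ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩ') {
        my $label = join ' ', map { sprintf 'U+%04X', ord } split //, $sample;
        printf "%-8s %-4s %s\n", $encoding,
            (round_trips($encoding, $sample) ? 'ok' : 'LOST'), $label;
    }
}
```

The samples are printed as code points rather than as characters so that the script does not depend on the terminal's encoding.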
You lack error-checking.
use utf8;
use Devel::Peek qw(Dump);
use Encode qw(encode);
sub as_shiftjis {
    my ($string) = @_;
    return encode(
        'Shift_JIS', # http://www.iana.org/assignments/character-sets
        $string,
        Encode::FB_CROAK
    );
}
Dump as_shiftjis 'helloソworld';
Dump as_shiftjis 'ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩ';
Output:
SV = PV(0x9148a0) at 0x9dd490
REFCNT = 1
FLAGS = (TEMP,POK,pPOK)
PV = 0x930e80 "hello\203\\world"\0
CUR = 12
LEN = 16
"\x{2160}" does not map to shiftjis at …
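If you would rather detect the failure than let the program die, the same FB_CROAK check can be wrapped in eval. This is only a sketch: try_encode is a hypothetical helper name, and the script assumes it is saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);

# Hypothetical helper: strict encoding that reports failure instead of dying.
# Returns (bytes, undef) on success or (undef, error message) on failure.
sub try_encode {
    my ($encoding, $text) = @_;
    # FB_CROAK makes encode() die on the first character it cannot map;
    # $text is a private copy, so in-place modification by encode() is harmless.
    my $bytes = eval { encode($encoding, $text, Encode::FB_CROAK) };
    return defined $bytes ? ($bytes, undef) : (undef, $@);
}

for my $encoding (qw(shiftjis cp932)) {
    my ($bytes, $error) = try_encode($encoding, 'ⅠⅡⅢ');
    if (defined $bytes) {
        printf "%s: encoded to %d bytes\n", $encoding, length $bytes;
    }
    else {
        print "$encoding: failed: $error";
    }
}
```

With this in place the caller can decide what to do with an unmappable character, for example retrying with cp932, instead of silently getting substituted output.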