开发者

shift jis decoding/encoding in perl

When I try decode a shift-jis encoded string and encode it back, some of the characters get garbled: I have following code:

use Encode qw(decode encode);
$val=;
print "\nbefore decoding: $val";
my $ustr = Encode::decode("shiftjis",$val);
print "\nafter decoding: $ustr";
print "\nbefore encoding: $ustr";
$val = Encode::encode("shiftjis",$ustr);
print "\nafter encoding: $val";

when I use a string : helloソworld in input it gets properly decoded and encoded back,i.e. before decoding and after encoding prints in above code print the same value. But when I tried another string like : ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩ

The end output got garbled.

Is it a perl library specific problem or it is a general shift jis mapping problem? Is there any solution for开发者_开发问答 it?


You should simply replace the shiftjis with cp932.

http://en.wikipedia.org/wiki/Code_page_932


You lack error-checking.

use utf8;
use Devel::Peek qw(Dump);
use Encode qw(encode);

sub as_shiftjis {
    my ($string) = @_;
    return encode(
        'Shift_JIS',    # http://www.iana.org/assignments/character-sets
        $string,
        Encode::FB_CROAK
    );
}

Dump as_shiftjis 'helloソworld';
Dump as_shiftjis 'ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩ';

Output:

SV = PV(0x9148a0) at 0x9dd490
  REFCNT = 1
  FLAGS = (TEMP,POK,pPOK)
  PV = 0x930e80 "hello\203\\world"\0
  CUR = 12
  LEN = 16
"\x{2160}" does not map to shiftjis at …
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜