Perl, Unicode and locales: how to process a string in a non-UTF-8 locale using `perl -p -i -e`?

setopt rcquotes
zsh -c 'export LANG="ru_RU.CP1251"; echo "Русский текст" | iconv -f utf8 | perl -p -i -e ''BEGIN{use open ":locale"}s/\p{InCyrillic}/й/g'''

gives me a bunch of errors:

"\x{00d0}" does not map to cp1251, <> line 1.
"\x{00b9}" does not map to cp1251, <> line 1.

What should be done to avoid these errors? (Note that the locale may be anything.)


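You forgot to declare the encoding of the substitution text. The source of the one-liner is UTF-8, so without `use utf8` Perl reads the literal й as the two bytes \x{00d0} and \x{00b9}, neither of which maps to cp1251. Update: the first revision of this answer used the nasty encoding pragma. It can be avoided entirely with the standard approach shown here, which for some reason did not occur to me until now.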

bash> export LANG=ru_RU.koi8r   # I do not have CP…

bash> echo "Русский текст" | iconv -f UTF-8 | hex
0000  f2 d5 d3 d3 cb c9 ca 20  d4 c5 cb d3 d4 0a        �������  �����.

bash> echo "Русский текст" | iconv -f UTF-8 | perl -p -i -e'BEGIN {use open ":locale"}; use utf8; s/\p{InCyrillic}/й/g' | hex
0000  ca ca ca ca ca ca ca 20  ca ca ca ca ca 0a        �������  �����.

bash> echo "Русский текст" | iconv -f UTF-8 | perl -p -i -e'BEGIN {use open ":locale"}; use utf8; s/\p{InCyrillic}/й/g' | iconv -t UTF-8
ййййййй ййййй