Perl, Unicode and locales: how to process a string in a non-UTF-8 locale using `perl -p -i -e`?
setopt rcquotes
zsh -c 'export LANG="ru_RU.CP1251"; echo "Русский текст" | iconv -f utf8 | perl -p -i -e ''BEGIN{use open ":locale"}s/\p{InCyrillic}/й/g'''
gives me a bunch of errors:
"\x{00d0}" does not map to cp1251, <> line 1.
"\x{00b9}" does not map to cp1251, <> line 1.
What should be done to avoid these errors? (Note that the locale may be anything.)
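You forgot to declare the encoding of the substitution text. Without the utf8 pragma, Perl reads the literal й in the one-liner as the two bytes 0xD0 0xB9 (its UTF-8 encoding), and those are exactly the two characters the error messages complain about. Update: in the first revision I had a solution involving the nasty encoding
pragma. It can be avoided completely; for some reason the standard way, shown below, did not come to mind until now.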
bash> export LANG=ru_RU.koi8r # I do not have CP…
bash> echo "Русский текст" | iconv -f UTF-8 | hex
0000 f2 d5 d3 d3 cb c9 ca 20 d4 c5 cb d3 d4 0a Русский текст.
bash> echo "Русский текст" | iconv -f UTF-8 | perl -p -i -e'BEGIN {use open ":locale"}; use utf8; s/\p{InCyrillic}/й/g' | hex
0000 ca ca ca ca ca ca ca 20 ca ca ca ca ca 0a ййййййй ййййй.
bash> echo "Русский текст" | iconv -f UTF-8 | perl -p -i -e'BEGIN {use open ":locale"}; use utf8; s/\p{InCyrillic}/й/g' | iconv -t UTF-8
ййййййй ййййй