How do I suppress UTF-8 warnings in Perl?
Due to various reasons I'm getting the error messages Malformed UTF-8 character
and
Wide character in print
from a legacy script.
I would like to suppress/disable those two warnings so that they are not written to STDERR.
How do I do that?
Presumably you are working in utf8. You have to turn on utf8 handling for each filehandle.
binmode STDERR, ":encoding(utf8)";
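You can do this for all the standard handles with use open qw(:std :encoding(utf8)). Without :std, the pragma only sets the default layers for handles opened later in the same lexical scope and leaves STDIN, STDOUT and STDERR untouched. See the documentation of the open pragma for more info.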
Finally, you can utf8-ify everything, your code, your filehandles and your arguments, with utf8::all.
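Note that :utf8 only marks the handle as UTF-8 without validating the data, whereas :encoding(utf8) checks that the data is valid, so it is safer. The hyphenated spelling :encoding(UTF-8) is stricter still. See perldoc -f binmode for details.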
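As a minimal sketch of what such an encoding layer does, the snippet below writes through :encoding(UTF-8) into an in-memory handle, which stands in for STDOUT or STDERR so that the resulting bytes can be inspected. The helper name encode_via_layer is made up for this illustration; in a real script the equivalent step would be a binmode call on the handle in question.

```perl
use strict;
use warnings;

# Write text through an :encoding(UTF-8) layer into an in-memory buffer
# (a stand-in for STDOUT/STDERR) and return the bytes that were written.
sub encode_via_layer {
    my ($text) = @_;
    my $buffer = '';
    open my $fh, '>:encoding(UTF-8)', \$buffer
        or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh or die "Cannot close in-memory handle: $!";
    return $buffer;
}

# U+263A (WHITE SMILING FACE) is a wide character; the layer encodes it
# to three bytes, and no "Wide character" warning is raised.
my $bytes = encode_via_layer("\x{263A}");
print join(' ', map { sprintf '%02X', ord } split //, $bytes), "\n";

# For the real handles the same layer would be applied with:
#   binmode STDOUT, ':encoding(UTF-8)';
#   binmode STDERR, ':encoding(UTF-8)';
```

The script prints the hex values of the encoded bytes (E2 98 BA for this character).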
no warnings 'utf8';
But it's best to figure out why you're getting the warning and fix the underlying problem. Those two warnings indicate something is going wrong in your script. Suppressing the warnings won't fix the error.
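If you do decide to silence the warning, keep in mind that no warnings is lexically scoped: it only affects code in the enclosing block or file, not print statements inside other modules. The following sketch illustrates this with an in-memory handle standing in for STDOUT; the @caught array and the __WARN__ handler are only there to make the effect visible.

```perl
use strict;
use warnings;

my @caught;                                   # warnings seen so far
$SIG{__WARN__} = sub { push @caught, $_[0] };

# A byte-oriented in-memory handle stands in for STDOUT without a layer.
my $buffer = '';
open my $out, '>', \$buffer or die "Cannot open in-memory handle: $!";

print {$out} "\x{263A}\n";                    # warns: Wide character in print

{
    no warnings 'utf8';                       # applies to this block only
    print {$out} "\x{263A}\n";                # silent
}

close $out or die "Cannot close in-memory handle: $!";
print scalar(@caught), " warning(s) caught\n";
```

Only the first print is reported. Restricting the pragma to the smallest possible block keeps the warning active for the rest of the script.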
Here are two examples to help you understand the errors:
milu@ubuntu: ~/Milu/Dev/Perl > cat malformed-utf8-char.pl
use utf8; # script source must be in UTF-8
use strict;
use warnings;
print "K�se\n";
milu@ubuntu: ~/Milu/Dev/Perl > perl malformed-utf8-char.pl
Malformed UTF-8 character (unexpected non-continuation byte 0x73,
immediately after start byte 0xe4) at malformed-utf8-char.pl line 4.
Kse
The source file is saved as Latin-1, while my terminal uses UTF-8. The string is actually "Käse". Either the utf8 pragma must be removed or the source must be saved as UTF-8.
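If re-saving a legacy file is not an option, one alternative (an assumption on my part, not something the script above does) is to treat the Latin-1 data as bytes and decode it explicitly with the core Encode module. A minimal sketch, with the variable names chosen for illustration:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "K\xE4se" is "Käse" as Latin-1 bytes; the escape keeps this file pure ASCII.
my $latin1_bytes = "K\xE4se";

# Turn the bytes into a character string, stating the encoding explicitly.
my $text = decode('ISO-8859-1', $latin1_bytes);

# Encode the characters as UTF-8 for a UTF-8 terminal or file.
my $utf8_bytes = encode('UTF-8', $text);

printf "%d characters, %d UTF-8 bytes\n", length($text), length($utf8_bytes);
```

The four characters become five bytes in UTF-8, because "ä" is encoded as the two bytes 0xC3 0xA4.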
milu@ubuntu: ~/Milu/Dev/Perl > cat wide-char-in-print.pl
use utf8;
use strict;
use warnings;
# binmode STDOUT, ':utf8';
print "Группа сайтов РИА Новости\n";
milu@ubuntu: ~/Milu/Dev/Perl > perl wide-char-in-print.pl
Wide character in print at wide-char-in-print.pl line 5.
Группа сайтов РИА Новости
The source contains Cyrillic characters, hence the utf8 pragma is in order. To print those characters to the terminal, however, STDOUT must also be set to UTF-8, which you can achieve by calling binmode (the commented-out line in the script). If you don't do this, a warning is triggered because a wide character (a code point above 0xFF) doesn't fit through a narrow (byte) output channel. The output still looks correct because Perl writes out the string's internal UTF-8 bytes as they are, and a UTF-8 terminal happens to display them correctly.
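To make the "wide" versus "narrow" distinction concrete, here is a small sketch that prints one character that fits in a byte and one that does not to a handle without an encoding layer. The helpers print_raw and hex_bytes are hypothetical names for this illustration, and an in-memory handle is used in place of STDOUT so the written bytes can be examined.

```perl
use strict;
use warnings;

# Print $text to a byte-oriented in-memory handle (no encoding layer) and
# return the bytes written plus the number of warnings raised.
sub print_raw {
    my ($text) = @_;
    my $warnings = 0;
    local $SIG{__WARN__} = sub { $warnings++ };
    my $buffer = '';
    open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh or die "Cannot close in-memory handle: $!";
    return ($buffer, $warnings);
}

# Format a byte string as space-separated hex values.
sub hex_bytes { return join ' ', map { sprintf '%02X', ord } split //, $_[0] }

# U+00E4 (a-umlaut) fits in one byte; U+0413 (Cyrillic Ge) is a wide character.
for my $char ("\x{E4}", "\x{413}") {
    my ($bytes, $warnings) = print_raw($char);
    printf "U+%04X -> bytes %s, warnings: %d\n",
        ord($char), hex_bytes($bytes), $warnings;
}
```

The first character is written as the single byte E4 without a warning. The second triggers the warning and is written as its UTF-8 bytes D0 93, which is why a UTF-8 terminal still shows it correctly.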
I had the same problem with debug output from Log4perl when running Perl in the Windows PowerShell console:
Wide character in print at C:/strawberry/perl/site/lib/Log/Log4perl/Appender/Screen.pm line 39.
The solution was to set the following in the Log4perl config file:
log4perl.appender.Screen.utf8 = 1