Malformed UTF-8 character error in regular expression in Perl

I get a 'Malformed UTF-8 character' error when I pass some scalar data to XML::Simple or Data::Dumper. The lines where the error occurs contain regular expressions.

Malformed UTF-8 character (fatal) at /usr/share/perl5/XML/Simple.pm line 1690.
Malformed UTF-8 character (fatal) at /usr/lib/perl/5.10/Data/Dumper.pm line 682.

So far I have failed to reproduce the error with a small piece of code.

XML::Simple 2.18
Data::Dumper 2.124
perl v5.10.1


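The problem arose because somewhere deep in the application's code, Encode::_utf8_on was called on a scalar that was not a valid UTF-8 string. That sets the scalar's internal UTF-8 flag without validating its bytes, so the error only surfaces later, when a regular expression inside XML::Simple or Data::Dumper processes the string.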
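To illustrate the cause described above, here is a minimal sketch. The sample bytes and the assumption that the data is really Latin-1 are hypothetical and not taken from the original application. It shows how Encode::_utf8_on leaves a scalar flagged but invalid, how utf8::valid can detect that state, and how decoding from the real source encoding avoids it.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical sample: Latin-1 bytes for "café".
# A lone 0xE9 at the end is not a complete UTF-8 sequence.
my $bytes = "caf\xE9";

# Risky: this only flips the internal UTF-8 flag; the bytes are not checked.
my $broken = $bytes;
Encode::_utf8_on($broken);

# utf8::is_utf8 reports the flag; utf8::valid checks that flag and bytes agree.
my $flagged = utf8::is_utf8($broken) ? 1 : 0;
my $valid   = utf8::valid($broken)   ? 1 : 0;
print "flagged=$flagged valid=$valid\n";    # prints "flagged=1 valid=0"

# Safer: decode from the actual source encoding (assumed to be Latin-1 here).
my $text = Encode::decode('ISO-8859-1', $bytes);
printf "length=%d last=U+%04X\n", length($text), ord(substr($text, -1));
# prints "length=4 last=U+00E9"
```

Running a check like utf8::valid on scalars just before they reach XML::Simple or Data::Dumper may help locate where the bad flag was set.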


You could try piping your data through Encoding::FixLatin. If the 'binary' bytes you're encountering are actually Latin-1 characters, they'll be converted to valid UTF-8. If they really are random binary bytes, they should at least be converted to random (but valid) UTF-8 characters :-)


The core Encode module provides facilities for handling malformed data (see the "Handling Malformed Data" section of its documentation). I have never used them myself, though.
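As a rough sketch of those facilities, the example below decodes a hypothetical byte string (not from the original question) in two ways: strictly, with the FB_CROAK check so that malformed input raises an exception, and leniently, with the default check, which substitutes U+FFFD for bytes that cannot be decoded.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical input: 0xE9 followed by a space is malformed as UTF-8.
my $bytes = "caf\xE9 au lait";

# Strict: FB_CROAK makes decode die on malformed input.
# LEAVE_SRC keeps $bytes unmodified when a check value is passed.
my $strict = eval {
    Encode::decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
};
my $failed = defined $strict ? 0 : 1;
print "strict decode failed: ", ($failed ? "yes" : "no"), "\n";

# Lenient: with no check argument, malformed bytes become U+FFFD.
my $lenient = Encode::decode('UTF-8', $bytes);
my $has_replacement = ($lenient =~ /\x{FFFD}/) ? 1 : 0;
print "replacement character inserted: ", ($has_replacement ? "yes" : "no"), "\n";
```

The strict form is probably the better fit when bad input should be rejected at the boundary; the lenient form keeps the program running at the cost of losing the original bytes.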
