Are character encoding issue causing my Perl output to look like gibberish?
I'm running a Perl script (both with 5.8.4) on two different machines (one Solaris 5.10, the other OpenSolaris 5.11). The output of the two scripts differs in the following way:
Solaris 5.10
$ perl myscript.pl
is' £ ä º <ä ¼ sa ... ³ ä º žÃ ... ¬ å ¸ ç ¬ ¬ ä º ¤ § œâ is œâ ¡ä ¸ ‡ å ... æœ ¬ æœ ¬ å ¸ È, ¡ä »½ çš" å ... ¬ å'Š
OpenSolaris 5.11
$ perl myscript.pl
Board of directors about the companys second-largest shareholder of the notice by Chibengongsi shares
Basically, the chars on the Solaris box are messed up. The output of locale on both is identical.
OpenSolaris $ perl -V Summary of my perl5 (revision 5 version 8 subversion 4) configuration: Platform: osname=solaris, osvers=2.11, archname=i86pc-solaris-64int uname='sunos localhost 5.11 i86pc i386 i86pc' config_args='' hint=recommended, useposix=true, d_sigaction=define usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_TS_ERRNO', optimize='-xO3 -xspace -xildoff', cppflags='' ccversion='Sun WorkShop', gccversion='', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags ='' libpth=/lib /usr/lib /usr/ccs/lib libs=-lsocket -lnsl -ldl -lm -lc perllibs=-lsocket -lnsl -ldl -lm -lc libc=/lib/libc.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-R /usr/perl5/5.8.4/lib/i86pc-solaris-64int/CORE' cccdlflags='-KPIC', lddlflags='-G' Characteristics of this binary (from libperl): Compile-time options: USE_64_BIT_INT USE_LARGE_FILES Locally applied patches: 22667 The optree builder was looping when constructing the ops ... 22715 Upgrade to FileCache 1.04 22733 Missing copyright in the README. 22746 fix a coredump caused by rv2gv not fully converting a PV ... 22755 Fix 29149 - another UTF8 cache bug hit by substr. 22774 [perl #28938] split could leave an array without ... 22775 [perl #29127] scalar delete of empty slice returned garbage 22776 [perl #28986] perl -e "open m" crashes Perl 22777 add test for change #22776 ("open m" crashes Perl) 22778 add test for change #22746 ([perl #29102] Crash on assign ... 22781 [perl #29340] Bizarre copy of ARRAY make sure a pad op's ... 22796 [perl #29346] Double warning for int(undef) and abs(undef) ... 22818 BOM-marked and (BOMless) UTF-16 scripts not working 22823 [perl #29581] glob() misses a lot of matches 22827 Smoke [5.9.2] 22818 FAIL(F) MSWin32 WinXP/.Net SP1 (x86/1 cpu) 22830 [perl #29637] Thread creation time is hypersensitive 22831 improve hashing algorithm for ptr tables in perl_clone: ... 22839 [perl #29790] Optimization busted: '@a = "b", sort @a' ... 22850 [PATCH] 'perl -v' fails if local_patches contains code snippets 22852 TEST needs to ignore SCM files 22886 Pod::Find should ignore SCM files and dirs 22888 Remove redundant %SIG assignments from FileCache 23006 [perl #30509] use encoding and "eq" cause memory leak 23074 Segfault using HTML::Entities 23106 Numeric comparison operators mustn't compare addresses of ... 23320 [perl #30066] Memory leak in nested shared data structures ... 23321 [perl #31459] Bug in read() 27722 perlio.c breaks on Solaris/gcc when > 256 FDs are available SPRINTF0 - fixes for sprintf formatting issues - CVE-2005-3962 6663288 Upgrade to CGI.pm 3.33 REGEXP0 - fix for UTF-8 recoding i开发者_开发百科n regexps - CVE-2007-5116 6758953 Perl Sys::Syslog can log messages with wrong severity Built under solaris Compiled at Apr 8 2009 17:12:58 @INC: /usr/perl5/5.8.4/lib/i86pc-solaris-64int /usr/perl5/5.8.4/lib /usr/perl5/site_perl/5.8.4/i86pc-solaris-64int /usr/perl5/site_perl/5.8.4 /usr/perl5/site_perl /usr/perl5/vendor_perl/5.8.4/i86pc-solaris-64int /usr/perl5/vendor_perl/5.8.4 /usr/perl5/vendor_perl .
Solaris
$perl -V Summary of my perl5 (revision 5 version 8 subversion 4) configuration: Platform: osname=solaris, osvers=2.10, archname=i86pc-solaris-64int uname='sunos localhost 5.10 i86pc i386 i86pc' config_args='' hint=recommended, useposix=true, d_sigaction=define usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_TS_ERRNO', optimize='-xO3 -xspace -xildoff', cppflags='' ccversion='Sun WorkShop', gccversion='', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags ='' libpth=/lib /usr/lib /usr/ccs/lib libs=-lsocket -lnsl -ldl -lm -lc perllibs=-lsocket -lnsl -ldl -lm -lc libc=/lib/libc.so, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-R /usr/perl5/5.8.4/lib/i86pc-solaris-64int/CORE' cccdlflags='-KPIC', lddlflags='-G' Characteristics of this binary (from libperl): Compile-time options: USE_64_BIT_INT USE_LARGE_FILES Locally applied patches: 22667 The optree builder was looping when constructing the ops ... 22715 Upgrade to FileCache 1.04 22733 Missing copyright in the README. 22746 fix a coredump caused by rv2gv not fully converting a PV ... 22755 Fix 29149 - another UTF8 cache bug hit by substr. 22774 [perl #28938] split could leave an array without ... 22775 [perl #29127] scalar delete of empty slice returned garbage 22776 [perl #28986] perl -e "open m" crashes Perl 22777 add test for change #22776 ("open m" crashes Perl) 22778 add test for change #22746 ([perl #29102] Crash on assign ... 22781 [perl #29340] Bizarre copy of ARRAY make sure a pad op's ... 22796 [perl #29346] Double warning for int(undef) and abs(undef) ... 22818 BOM-marked and (BOMless) UTF-16 scripts not working 22823 [perl #29581] glob() misses a lot of matches 22827 Smoke [5.9.2] 22818 FAIL(F) MSWin32 WinXP/.Net SP1 (x86/1 cpu) 22830 [perl #29637] Thread creation time is hypersensitive 22831 improve hashing algorithm for ptr tables in perl_clone: ... 22839 [perl #29790] Optimization busted: '@a = "b", sort @a' ... 22850 [PATCH] 'perl -v' fails if local_patches contains code snippets 22852 TEST needs to ignore SCM files 22886 Pod::Find should ignore SCM files and dirs 22888 Remove redundant %SIG assignments from FileCache 23006 [perl #30509] use encoding and "eq" cause memory leak 23074 Segfault using HTML::Entities 23106 Numeric comparison operators mustn't compare addresses of ... 23320 [perl #30066] Memory leak in nested shared data structures ... 23321 [perl #31459] Bug in read() 27722 perlio.c breaks on Solaris/gcc when > 256 FDs are available SPRINTF0 - fixes for sprintf formatting issues - CVE-2005-3962 6663288 Upgrade to CGI.pm 3.33 REGEXP0 - fix for UTF-8 recoding in regexps - CVE-2007-5116 Built under solaris Compiled at Jul 31 2008 10:45:31 @INC: /usr/perl5/5.8.4/lib/i86pc-solaris-64int /usr/perl5/5.8.4/lib /usr/perl5/site_perl/5.8.4/i86pc-solaris-64int /usr/perl5/site_perl/5.8.4 /usr/perl5/site_perl /usr/perl5/vendor_perl/5.8.4/i86pc-solaris-64int /usr/perl5/vendor_perl/5.8.4 /usr/perl5/vendor_perl .
Try use utf8; or use encoding "whatever";.
The perls (lowercase perl, I'm talking about the executable here, Perl fans ;) ) could be compiled differently. If you could show us the perl -V
from each of them, we might be able to give you a better answer.
Update:
Well, it looks like a diff shows that the only real difference is the compile date from the perl side of things. If so, the next thing I'd check are the libraries that you are using are the same versions (look for $VERSION = X.X.X
or $VERSION = X.XXX_XXX
or something like that.)
If the libraries you are using are the same, it might be the shell itself; it might be configured to interpret the output using different locales. Try doing a
perl myscript.pl > out.txt
on the solaris box and seeing if that changes anything.
Check for any differences in environment variables.
Check the encoding used by your terminals.
精彩评论