How can I properly align UTF-8 strings with Perl's printf?
What is the right way to get neatly aligned output here (every line with the same indentation)?
#!/usr/bin/env perl
use warnings;
use strict;
use DBI;
my $phone_book = [ [ qw( name number ) ],
                   [ 'Kroner', 123456789 ],
                   [ 'Holler', 123456789 ],
                   [ 'Mühßig', 123456789 ],
                   [ 'Singer', 123456789 ],
                   [ 'Maurer', 123456789 ],
];
# connect() takes ( $dsn, $user, $password, \%attr ), so the attributes go fourth
my $dbh = DBI->connect( "DBI:CSV:", undef, undef, { RaiseError => 1 } );
$dbh->do( qq{ CREATE TEMP TABLE phone_book AS IMPORT( ? ) }, {}, $phone_book );
my $sth = $dbh->prepare( qq{ SELECT name, number FROM phone_book } );
$sth->execute;
my $array_ref = $sth->fetchall_arrayref();
for my $row ( @$array_ref ) {
    printf "%9s %10s\n", @$row;
}
# OUTPUT:
#    Kroner  123456789
#    Holler  123456789
#  Mühßig  123456789
#    Singer  123456789
#    Maurer  123456789
I haven't been able to reproduce it, but loosely speaking what seems to be happening is a character encoding mismatch. Most likely your Perl source file has been saved in UTF-8 encoding, but you have not enabled use utf8; in the script. Perl therefore treats each of the non-ASCII German characters as two characters (one per byte) and sets the padding accordingly. The terminal you're running on is also in UTF-8 mode, so the characters still print correctly. Your script already has use warnings; but that alone will not flag this. I would not be surprised if adding use utf8; (together with a UTF-8 output layer on STDOUT) actually fixes the problem.
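To make the byte-versus-character point concrete, here is a minimal sketch using only the core Encode module. The variable names are made up for illustration, and the name is written with byte escapes so the result does not depend on how the file is saved.

```perl
use strict;
use warnings;
use Encode qw( decode );

# The UTF-8 bytes of "Mühßig" (ü = C3 BC, ß = C3 9F), as Perl sees them
# when the source is UTF-8 but 'use utf8' is missing.
my $bytes = "M\xC3\xBCh\xC3\x9Fig";

# The same name after decoding: a string of 6 characters.
my $chars = decode( 'UTF-8', $bytes );

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
# prints "bytes: 8, characters: 6"

# printf/sprintf pad according to length(), so the undecoded string
# receives two fewer spaces of padding than the decoded one.
my $padded_bytes = sprintf( "%9s", $bytes );   # 1 space  + 8 bytes
my $padded_chars = sprintf( "%9s", $chars );   # 3 spaces + 6 characters
```

Both padded strings have a length() of 9, but only the decoded one occupies 9 columns on a UTF-8 terminal once it is encoded again for output.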
#!/usr/bin/env perl
use warnings;
use strict;
use utf8;   # The source file itself contains UTF-8 (separate from file-handle encodings)
use Encode qw( decode_utf8 );

binmode( STDOUT, ':encoding(UTF-8)' );   # Encode output to STDOUT as UTF-8

my @strings = ( 'Mühßig', 'Holler' );    # UTF-8 literals, decoded because of 'use utf8'
foreach my $s (@strings) {
    printf( "%-15s %10s\n", $s, 'lined up' );   # should line up nicely
}

# Same idea, but decoding input from a file instead of encoding output to STDOUT
open( my $fh, '<', 'utf8file' ) or die "Failed to open file: $!";
binmode( $fh, ':encoding(UTF-8)' );
while ( my $line = <$fh> ) {
    chomp $line;
    printf( "%-15s %10s\n", $line, 'lined up' );
}
close( $fh );

# This works too: read raw bytes and decode each line explicitly
open( $fh, '<', 'utf8file' ) or die "Failed to open file: $!";
while ( my $line = <$fh> ) {
    chomp $line;
    $line = decode_utf8( $line );
    printf( "%-15s %10s\n", $line, 'lined up' );
}
close( $fh );
You can't use Unicode with printf if you have code points that take 0 or 2 print columns instead of 1, which it appears you do. You need to use Unicode::GCString instead.
Wrong way:
printf "%-10.10s", our $string;
Right way:
use Unicode::GCString;
my $gcstring = Unicode::GCString->new(our $string);
my $colwidth = $gcstring->columns();
if ($colwidth > 10) {
    # note: substr() counts grapheme clusters, not columns
    print $gcstring->substr(0,10);
} else {
    # left-justify like %-10s: the text first, then the padding
    print $gcstring;
    print " " x (10 - $colwidth);
}
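If installing Unicode::GCString is not an option, the column-counting idea can be roughly approximated with core Perl only. The sketch below is not that module's algorithm. It assumes combining marks take 0 columns and East Asian Wide or Fullwidth characters take 2, and it ignores control characters and emoji sequences. The helper names display_width and pad_left are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: approximate terminal columns for a decoded string.
sub display_width {
    my ($str) = @_;
    my $zero = () = $str =~ /[\p{Mn}\p{Me}]/g;       # combining marks: 0 columns
    my $wide = () = $str =~ /[\p{Ea=W}\p{Ea=F}]/g;   # wide/fullwidth: 2 columns
    return length($str) - $zero + $wide;
}

# Hypothetical helper: right-justify like "%9s", but by columns, not characters.
sub pad_left {
    my ( $str, $width ) = @_;
    my $missing = $width - display_width($str);
    return ( $missing > 0 ? ' ' x $missing : '' ) . $str;
}

binmode( STDOUT, ':encoding(UTF-8)' );

# Precomposed "Mühßig", decomposed "Mühßig" (u + U+0308), and two wide characters.
for my $name ( "Kroner", "M\x{FC}h\x{DF}ig", "Mu\x{308}h\x{DF}ig", "\x{65E5}\x{672C}" ) {
    print pad_left( $name, 9 ), " 123456789\n";
}
```

For the names in the question, which contain only single-column characters, this gives the same padding as printf on properly decoded strings. The difference only shows up with combining marks or wide characters, which is the case Unicode::GCString handles more completely.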