Why won't this regex remove final whitespace from Pod::Usage text?
I am working on a module that relies on Pod::Usage to parse the calling script's POD and then send usage, help, and man text to a scalar variable. I needed to remove the final whitespace from that text, so I used a simple regex that I thought would work. And it did ... but intermittently.
Here's a demonstration of the problem. Any insights would be appreciated.
The unexpected behavior (i.e., the failure of the regex to remove final newlines) occurs consistently on my Solaris machine with Perl 5.10.1. Under Windows with Perl 5.12.1, the behavior is erratic (output supplied below).
use strict;
use warnings;
use Pod::Usage qw(pod2usage);
use Test::More;
# Baseline test to show that the regex works.
my $exp = "foo\nbar\n...";
my $with_trailing_whitespace = $exp . " \n\n";
$with_trailing_whitespace =~ s!\s+\Z!!;
my $ords = get_ords_of_final_chars($with_trailing_whitespace);
is_deeply $ords, [46, 46, 46]; # String ends with 3 periods (not whitespace).
# Run a similar test, using text from Pod::Usage.
for (1 .. 2) {
    my $pod = get_pod_text();
    $ords = get_ords_of_final_chars($pod);
    is_deeply $ords, [46, 46, 46];
}
done_testing();
sub get_ords_of_final_chars {
    # Takes a string. Returns an array ref of the ord() of the last 3 characters.
    my $s = shift;
    return [ map ord(substr $s, - $_, 1), 1 .. 3 ];
}
sub get_pod_text {
    # Call pod2usage(), sending output to a scalar.
    open(my $fh, '>', \my $txt) or die $!;
    pod2usage(-verbose => 2, -exitval => 'NOEXIT', -output => $fh);
    close $fh; # This doesn't help.
    # Here's the same regex as above.
    #
    # If I use chomp(), the newlines are consistently removed:
    #     1 while chomp($txt);
    $txt =~ s!\s+\Z!!;
    return $txt;
}
__END__

=head1 NAME

sample - Some script...

=head1 SYNOPSIS

foo.pl ARGS...

=head1 DESCRIPTION

This program will read the given input file(s) and do something
useful with the contents thereof...

=cut
Output on my Windows box:
$ perl demo.pl
ok 1
not ok 2
# Failed test at demo.pl line 18.
# Structures begin differing at:
# $got->[0] = '10'
# $expected->[0] = '46'
not ok 3
# Failed test at demo.pl line 18.
# Structures begin differing at:
# $got->[0] = '10'
# $expected->[0] = '46'
1..3
# Looks like you failed 2 tests of 3.
$ perl demo.pl
ok 1
ok 2
ok 3
1..3
Well, to quote perlre:
\Z Match only at end of string, or before newline at the end
\z Match only at end of string
So, you should be using $txt =~ s!\s+\z!!; (lower case z).
Although, since \s+ is greedy, I would have expected it to work anyway. Maybe it's a Perl bug.
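To make the perlre distinction concrete, here is a minimal sketch using an ordinary string literal rather than Pod::Usage output. The variable names are made up for illustration, and it does not reproduce the intermittent failure from the question. It only shows where \Z and \z differ, and that on a plain string a greedy \s+ should strip the trailing whitespace with either anchor.

```perl
use strict;
use warnings;

# \Z may match just before a final newline; \z matches only at the very end.
my $matches_Z = "foo\n" =~ /foo\Z/ ? 1 : 0;
my $matches_z = "foo\n" =~ /foo\z/ ? 1 : 0;

# Same shape of data as the baseline test in the question.
my $text = "foo\nbar\n... \n\n";

# Copy the string, then strip trailing whitespace with each anchor.
(my $stripped_Z = $text) =~ s/\s+\Z//;
(my $stripped_z = $text) =~ s/\s+\z//;

print "foo\\Z matches: $matches_Z\n";
print "foo\\z matches: $matches_z\n";
print "same result with both anchors: ",
    ($stripped_Z eq $stripped_z ? "yes" : "no"), "\n";
```

If the two anchors agree here but the Pod::Usage text still keeps its newlines, that would suggest the problem lies in how the scalar was filled through the in-memory filehandle rather than in the choice of anchor, though this sketch alone cannot confirm that.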
While the other posters are correct about \z, \Z, and $, for what it's worth, I don't get any failures on Win32:
$ perl -d:Modlist demo.pl
ok 1
ok 2
ok 3
1..3
Carp 1.17
Config
Encode 2.43
Encode::Alias 2.14
Encode::Config 2.05
Encode::Encoding 2.05
Exporter 5.64_01
Exporter::Heavy 5.64_01
File::Spec 3.33
File::Spec::Unix 3.33
File::Spec::Win32 3.33
PerlIO 1.06
PerlIO::scalar 0.08
Pod::Escapes 1.04
Pod::InputObjects 1.31
Pod::Parser 1.37
Pod::Select 1.36
Pod::Simple 3.16
Pod::Simple::BlackBox 3.16
Pod::Simple::LinkSection 3.16
Pod::Text 3.15
Pod::Usage 1.36
Test::Builder 0.98
Test::Builder::Module 0.98
Test::More 0.98
XSLoader 0.15
base 2.15
bytes 1.04
integer 1.00
overload 1.10
vars 1.01
warnings 1.09
warnings::register 1.01