What does the Perl split function return when there is no value between tokens?
I'm trying to split a string using the split function but there isn't always a value between tokens.
Ex: ABC,123,,,,,,XYZ
I don't want to skip the multiple tokens though. These values are in specific positions in the string. However, when I do a split, and then try to step through my resulting array, I get "Use of uninitialized value" warnings.
I've tried comparing the value using $splitvalues[x] eq ""
and I've tried using defined($splitvalues[x])
, but I c开发者_如何学Pythonan't for the life of me figure out how to identify what the split function is putting in to my array when there is no value between tokens.
Here's the snippet of my code (now with more crunchy goodness):
my @matrixDetail = ();
#some other processing happens here that is based on matching data from the
#@oldDetail array with the first field of the @matrixLine array. If it does
#match, then I do the split
if($IHaveAMatch)
{
@matrixDetail = split(',', $matrixLine[1]);
}
else
{
@matrixDetail = ('','','','','','','');
}
my $newDetailString =
(($matrixDetail[0] eq '') ? $oldDetail[0] : $matrixDetail[0])
. (($matrixDetail[1] eq '') ? $oldDetail[1] : $matrixDetail[1])
.
.
.
. (($matrixDetail[6] eq '') ? $oldDetail[6] : $matrixDetail[6]);
because this is just snippets, I've left some of the other logic out, but the if statement is inside a sub that technically returns the @matrixDetail array back. If I don't find a match in my matrix and set the array equal to the array of empty strings manually, then I get no warnings. It's only when the split populates the @matrixDetail.
Also, I should mention, I've been writing code for nearly 15 years, but only very recently have I needed to work with Perl. The logic in my script is sound (or at least, it works), I'm just being anal about cleaning up my warnings and trying to figure out this little nuance.
#!perl
use warnings;
use strict;
use Data::Dumper;
my $str = "ABC,123,,,,,,XYZ";
my @elems = split ',', $str;
print Dumper \@elems;
This gives:
$VAR1 = [
'ABC',
'123',
'',
'',
'',
'',
'',
'XYZ'
];
It puts in an empty string.
Edit: Note that the documentation for split()
states that "by default, empty leading fields are preserved, and empty trailing ones are deleted." Thus, if your string is ABC,123,,,,,,XYZ,,,,
then your returned list will be the same as the above example, but if your string is ,,,,ABC,123
, then you will have a list with three empty strings in elements 0, 1, and 2 (in addition to 'ABC'
and '123'
).
Edit 2: Try dumping out the @matrixDetail
and @oldDetail
arrays. It's likely that one of those isn't the length that you think it is. You might also consider checking the number of elements in those two lists before trying to use them to make sure you have as many elements as you're expecting.
I suggest to use Text::CSV from CPAN. It is a ready made solution which already covers all the weird edge cases of parsing CSV formatted files.
delims with nothing between them give empty strings when split. Empty strings evaluate as false in boolean context.
If you know that your "details" input will never contain "0" (or other scalar that evaluates to false), this should work:
my @matrixDetail = split(',', $matrixLine[1]);
die if @matrixDetail > @oldDetail;
my $newDetailString = "";
for my $i (0..$#oldDetail) {
$newDetailString .= $matrixDetail[$i] || $oldDetail[$i]; # thanks canSpice
}
say $newDetailString;
(there are probably other scalars besides empty string and zero that evaluate to false but I couldn't name them off the top of my head.)
TMTOWTDI:
$matrixDetail[$_] ||= $oldDetail[$_] for 0..$#oldDetail;
my $newDetailString = join("", @matrixDetail);
edit: for loops now go from 0 to $#oldDetail
instead of $#matrixDetail
since trailing ",,," are not returned by split.
edit2: if you can't be sure that real input won't evaluate as false, you could always just test the length of your split elements. This is safer, definitely, though perhaps less elegant ^_^
Empty fields in the middle will be ''. Empty fields on the end will be omitted, unless you specify a third parameter to split large enough (or -1 for all).
精彩评论