Using String Masks in Perl
I have a program that allows a user to specify a mask such as MM-DD-YYYY, and compare it to a string. In the string, the MM will be assumed to be a month, DD will be the day of the month, and YYYY will be the year. Everything else must match exactly:
- String: 12/31/2010 Mask MM-DD-YYYY: Fail: Must use slashes and not dashes
- String: 12/31/2010 Mask DD/MM/YYYY: Fail: Month must be second and there's no month 31.
- String: 12/31-11 Mask MM/DD-YY: Pass: String matches mask.
Right now, I use index and substr to pull out the month, day, and year, then I use xor to generate a mask for everything else. It seems a bit inelegant, and I was wondering if there's a better way of doing this:
my $self = shift;
my $date = shift;
my $format = $self->Format();
my $month;
my $year;
my $day;
my $monthIndex;
my $yearIndex;
my $dayIndex;
#
# Pull out Month, Day, and Year
#
if (($monthIndex = index($format, "MM")) != -1) {
    $month = substr($date, $monthIndex, 2);
}
if (($dayIndex = index($format, "DD")) != -1) {
    $day = substr($date, $dayIndex, 2);
}
if (($yearIndex = index($format, "YYYY")) != -1) {
    $year = substr($date, $yearIndex, 4);
}
elsif (($yearIndex = index($format, "YY")) != -1) {
    $year = substr($date, $yearIndex, 2);
    if ($year < 50) {
        $year += 2000;
    }
    else {
        $year += 1900;
    }
}
#
# Validate the Rest of Format
#
(my $restOfFormat = $format) =~ s/[MDY]/./g; #Month Day and Year can be anything
if ($date !~ /^$restOfFormat$/) {
    return; #Does not match format
}
[...More Stuff before I return a true value...]
I'm doing this for dates, times (using HH, MM, SS, and A/AA), and IP addresses in my code.
BTW, I tried using regular expressions to pull the date from the string, but it's even messier:
#-----------------------------------------------------------------------
# FIND MONTH
#
my $mask = "M" x length($format); #All M's the length of format string
my $monthMask = ($format ^ $mask); #Bytes w/ "M" will be "NULL"
$monthMask =~ s/\x00/\xFF/g; #Change Null bytes to "FF"
$monthMask =~ s/[^\xFF]/\x00/g; #Null out other bytes
#
# ####Mask created! Apply mask to Date String
#
$month = ($monthMask & $date); #Nulls or Month Value
$month =~ s/\x00//g; #Remove Null bytes from string
#
#-----------------------------------------------------------------------
It's a neat programming trick, but it was pretty hard to understand exactly what I was doing and thus would make it hard for someone else to maintain.
Another option could be to rewrite your mask into a strftime/strptime pattern and test with those functions. I am using the versions included in the core Time::Piece module.
use Time::Piece;
test('12/31/2010' => 'MM-DD-YYYY');
test('12/31/2010' => 'DD/MM/YYYY');
test('12/31-11' => 'MM/DD-YY');
sub test {
    my ($time, $mask) = @_;
    my $t = eval { Time::Piece->strptime($time, make_format_from($mask)) };
    print "String: $time Mask: $mask "
        . (defined $t ? "Pass: ".$t->ymd : "Fail"), "\n";
}
sub make_format_from {
    my $mask = shift;
    for($mask) {
        s/YYYY/%Y/;
        s/YY/%y/;
        s/MM/%m/;
        s/DD/%d/;
    }
    return $mask;
}
This code yields
String: 12/31/2010 Mask: MM-DD-YYYY Fail
String: 12/31/2010 Mask: DD/MM/YYYY Fail
String: 12/31-11 Mask: MM/DD-YY Pass: 2011-12-31
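If you want the check to be a bit stricter, one possible refinement is a round trip: parse with strptime, format the result again with the same pattern, and require that it equals the input. The sketch below assumes the same four tokens as above; mask_to_strptime and parse_strict are names I made up for illustration, and I have not checked how every Time::Piece version treats edge cases such as trailing characters, which is part of why the extra comparison seems worthwhile.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical token table: mask tokens mapped to strptime conversions.
my %STRPTIME = (YYYY => '%Y', YY => '%y', MM => '%m', DD => '%d');

# Turn a user mask such as MM/DD-YY into a strptime format such as %m/%d-%y.
# YYYY is listed before YY so the longer token wins.
sub mask_to_strptime {
    my ($mask) = @_;
    $mask =~ s/(YYYY|YY|MM|DD)/$STRPTIME{$1}/g;
    return $mask;
}

# Return a Time::Piece object if $input matches $mask, nothing otherwise.
sub parse_strict {
    my ($input, $mask) = @_;
    my $format = mask_to_strptime($mask);

    # strptime dies on input it cannot parse, so trap that with eval.
    my $t = eval { Time::Piece->strptime($input, $format) };
    return unless defined $t;

    # Round trip: the parsed date must format back to exactly the input.
    return unless $t->strftime($format) eq $input;
    return $t;
}

for my $case (['12/31/2010', 'MM-DD-YYYY'],
              ['12/31/2010', 'DD/MM/YYYY'],
              ['12/31-11',   'MM/DD-YY']) {
    my ($input, $mask) = @$case;
    my $t = parse_strict($input, $mask);
    print "String: $input Mask: $mask ", ($t ? "Pass: " . $t->ymd : "Fail"), "\n";
}
```

The substitution uses /g so a token that appears more than once in the mask is converted everywhere.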
You are already using some regular expressions in this code. Why not convert the user's mask into a pattern and use a regular expression to validate the input directly? Say,
$mask =~ s/YYYY/\\d{4}/;
# or: $mask =~ s/YYYY/[12][0-9]{3}/
$mask =~ s/MM/(0[1-9]|1[0-2])/; # MM => 01 - 12
$mask =~ s/DD/(0[1-9]|[12][0-9]|3[01])/; # DD => 01 - 31
$mask =~ s/YY/\\d{2}/; # YY => 00 - 99
$mask = '^' . $mask . '$';
So, for example, this would compile the user mask MM-DD-YY into the pattern
^(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-\d{2}$
which you could test with:
if ($input =~ qr/$mask/) {
    print "Input is valid\n";
} else {
    print "Input is invalid\n";
}
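A variation on the same idea, in case it helps: split the mask into tokens and literal text, quote the literal text with quotemeta (so a "." or "+" in the mask cannot act as a regex metacharacter), and use named captures so the fields come back by name. This is only a sketch; compile_mask, parse_date and the %TOKEN table are hypothetical names, and named captures need Perl 5.10 or later.

```perl
use strict;
use warnings;
use 5.010;    # named captures and %+

# Hypothetical token table: each mask token maps to a named capture group.
my %TOKEN = (
    YYYY => '(?<year>\d{4})',
    YY   => '(?<year>\d{2})',
    MM   => '(?<month>0[1-9]|1[0-2])',
    DD   => '(?<day>0[1-9]|[12][0-9]|3[01])',
);

# Compile a user mask into an anchored regular expression.
sub compile_mask {
    my ($mask) = @_;

    # The capturing group makes split return the tokens as well as the
    # text between them. YYYY comes before YY so the longer token wins.
    my @parts = split /(YYYY|YY|MM|DD)/, $mask;

    # Tokens become sub-patterns; everything else must match literally.
    my $pattern = join '',
        map { exists $TOKEN{$_} ? $TOKEN{$_} : quotemeta($_) } @parts;

    return qr/\A$pattern\z/;
}

# Return a hash reference of the captured fields, or nothing on mismatch.
sub parse_date {
    my ($input, $mask) = @_;
    my $re = compile_mask($mask);
    return unless $input =~ $re;
    my %date = %+;    # copy the named captures before they go away
    return \%date;
}

if (my $d = parse_date('12/31-11', 'MM/DD-YY')) {
    print "month=$d->{month} day=$d->{day} year=$d->{year}\n";
}
```

The same table-driven approach should extend to time or IP masks by adding entries, as long as no two tokens share a spelling (MM for both month and minute would need separate tables).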
You can simplify by using regular expressions. For MM/DD-YY:
die "Wrong format" unless $date =~ /([01][0-9])\/([0-3][0-9])-([0-9][0-9])/;
If it matches, the parentheses capture the different parts, which can be referred to as $1, $2, etc. Then use those variables for further testing, e.g. whether the month is between 1 and 12.
Btw, this pattern is not Y2K compatible: a two-digit year does not say which century it belongs to.
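To make the "further testing" concrete, here is a rough sketch for the MM/DD-YY case. It anchors the pattern, checks the month and day ranges (including leap years), and resolves the two-digit year with the same rule the question uses (below 50 means 20xx, otherwise 19xx). The names check_mm_dd_yy and days_in_month are made up for this example.

```perl
use strict;
use warnings;

# Number of days in a month, taking leap years into account.
sub days_in_month {
    my ($year, $month) = @_;
    my @days = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    my $leap = ($year % 4 == 0 && $year % 100 != 0) || $year % 400 == 0;
    return $days[$month - 1] + ($month == 2 && $leap ? 1 : 0);
}

# Return (year, month, day) for a valid MM/DD-YY string, or an empty list.
sub check_mm_dd_yy {
    my ($date) = @_;

    # \A and \z anchor the match so extra leading or trailing text fails.
    my ($month, $day, $yy) = $date =~ m{\A(\d\d)/(\d\d)-(\d\d)\z}
        or return;

    # Two-digit year window taken from the question: 00-49 => 20xx.
    my $year = $yy < 50 ? 2000 + $yy : 1900 + $yy;

    return if $month < 1 || $month > 12;
    return if $day < 1 || $day > days_in_month($year, $month);
    return ($year, $month + 0, $day + 0);
}

my @parts = check_mm_dd_yy('12/31-11');
print @parts ? "Valid: @parts\n" : "Wrong format\n";
```

Doing the range checks in code rather than in the pattern keeps the regex short and makes it possible to reject dates such as 02/30.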