Regular Expression to match a valid day in a date
I need help coming up with a regex to make sure the user enters a valid date.
The string will be in the format mm/dd/yyyy.
Here is what I have come up with so far.
/\[1-9]|0[1-9]|1[0-2]\/\d{1,2}\/19|20\d\d/
I have validated the regex so that the user cannot enter a month higher than 12 and the year has to start with either "19" or "20". What I am having trouble with is figuring out some logic for validating the day. The day should not go over 31.
Regex for 01-31:
(0[1-9]|[12]\d|3[01])
Or if you don't want days with a preceding zero (e.g. 05):
([1-9]|[12]\d|3[01])
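As a rough sketch, the zero-padded day pattern can be dropped into a full mm/dd/yyyy check. The variable names and the looks_like_date helper are hypothetical, and the month and year parts simply follow the rules stated in the question. Note that this only checks the shape of the string, so something like 02/30/1999 still passes.

```perl
use strict;
use warnings;

# Hypothetical building blocks; month and year follow the question's rules.
my $month = qr/(?:0[1-9]|1[0-2])/;          # 01-12
my $day   = qr/(?:0[1-9]|[12]\d|3[01])/;    # 01-31, the zero-padded pattern above
my $year  = qr/(?:19|20)\d\d/;              # 1900-2099

# \A and \z anchor the match so nothing may precede or follow the date.
my $date_re = qr/\A$month\/$day\/$year\z/;

# Hypothetical helper: true if $s has the shape mm/dd/yyyy.
sub looks_like_date {
    my ($s) = @_;
    return $s =~ $date_re ? 1 : 0;
}

for my $s (qw(01/31/2021 12/32/2021 13/01/2021 02/30/1999)) {
    printf "%-10s %s\n", $s, looks_like_date($s) ? 'ok' : 'rejected';
}
```

Keeping the three parts as separate qr// objects makes it easy to swap in the non-padded day pattern if single-digit days should be accepted.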
- As many have noted above, if we want to validate the date as a whole then a RegEx is a very poor choice.
- But if we want to match a pattern of numbers, in this case from 01-31, then RegEx is fine so long as there is some backend logic that validates the date as a whole, if so desired. I see the expected answer currently fails for 10, 20.
- Test:
gawk 'BEGIN{ for(i=0;i<=32;i++){ if (i ~ /^([0-2]?[1-9]|3[01])$/){print i " yes"}else {print i " no"} } }'
- This can be corrected as follows:
^([0-2]?[1-9]|3[01]|10|20)$
- Test:
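A rough Perl equivalent of the gawk check, comparing the two patterns quoted in this answer over 1..31. The variable names and the missed_days helper are made up for illustration.

```perl
use strict;
use warnings;

my $flawed    = qr/^([0-2]?[1-9]|3[01])$/;          # pattern from the first test
my $corrected = qr/^([0-2]?[1-9]|3[01]|10|20)$/;    # pattern with 10 and 20 added

# Hypothetical helper: the values in 1..31 that the given pattern fails to match.
sub missed_days {
    my ($re) = @_;
    return grep { $_ !~ $re } 1 .. 31;
}

print "flawed misses:    @{[ missed_days($flawed) ]}\n";
print "corrected misses: @{[ missed_days($corrected) ]}\n";
```

The first pattern misses 10 and 20 because the second character must be in [1-9]; the corrected pattern lists those two values explicitly.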
So kindly consider the following solution...
1. Identify the sets that need to be matched:
- Days with prefix "0": {01,...,09},{10,...,31}
  - Sub-set {10,...,31} can be split into => {10,...,29},{30,31}
- Without any prefix: {1,...,31} => {1,...,9},{10,...,31}
2. Corresponding regular expressions for each sub-set:
---------------------------------
Sub-Set | Regular-Expression
---------------------------------
{01,...,09} | [0][1-9]
{10,...,29} | [1-2][0-9]
{30,31} | 3[01]
{1,...,9} | [1-9]
---------------------------------
Now we can group ([0][1-9]) and ([1-9]) together as ([0]?[1-9]), where ? signifies 0 or 1 occurrences of the pattern/symbol. [UPDATE] - Thank you @MattFrear for pointing it out.
So the resulting RegEx is: ^(([0]?[1-9])|([1-2][0-9])|(3[01]))$
Tested here: http://regexr.com/?383k1 [UPDATE]
use DateTime;
Other solutions are fine, probably work, etc. Usually, you end up wanting to do a bit more, and then a bit more, and eventually you have some crazy code, and leap years, and why are you doing it yourself again?
DateTime and its formatters are your solution. Use them! Sometimes they are a bit overkill, but often that works out for you down the road.
use DateTime::Format::Strptime;
# %m/%d/%Y matches the mm/dd/yyyy format from the question
my $dayFormat = DateTime::Format::Strptime->new(pattern => '%m/%d/%Y');
my $foo = $dayFormat->parse_datetime($myDateString);
$foo is now a DateTime object. Enjoy.
If your date string wasn't properly formatted, $foo will be undef and $dayFormat->errstr will tell you why.
^(((((((0?[13578])|(1[02]))[\.\-/]?((0?[1-9])|([12]\d)|(3[01])))|(((0?[469])|(11))[\.\-/]?((0?[1-9])|([12]\d)|(30)))|((0?2)[\.\-/]?((0?[1-9])|(1\d)|(2[0-8]))))[\.\-/]?(((19)|(20))?([\d][\d]))))|((0?2)[\.\-/]?(29)[\.\-/]?(((19)|(20))?(([02468][048])|([13579][26])))))$
From Expressions in category: Dates and Times
Validates the correct number of days in a month, looks like it even handles leap years.
You can of course replace [\.\-/] with / to only allow slashes.
This isn't all that hard...
qr#^
    (?: 0[1-9] | 1[012] )
    /
    (?:
        0[1-9] | 1[0-9] | 2[0-8]
      | (?<! 0[2469]/ | 11/ ) 31
      | (?<! 02/ ) 30
      | (?<! 02/
             (?= ...
                 (?:
                     .. (?: [02468][1235679] | [13579][01345789] )
                   | (?: [02468][1235679] | [13579][01345789] ) 00
                 )
             )
        ) 29
    )
    /
    [0-9]{4}
    \z
#x
If you want to check for valid dates, you have to do much more than check numbers and ranges. Fortunately, Perl already has everything you need for this. The Time::Piece module comes with Perl and can parse a date. It knows how to parse dates and do the first round of checks:
use v5.10;
use Time::Piece; # comes with Perl
my @dates = qw(
01/06/2021 01/37/456 10/6/1582 10/18/1988
2/29/1900 2/29/1996 2/29/2000
);
foreach my $date ( @dates ) {
    my $t = eval { Time::Piece->strptime( $date, '%m/%d/%Y' ) };
    unless( $t ) {
        say "Date <$date> is not valid";
        next;
    }
    say $t;
}
The output is interesting and no other solution here is close to handling this. Why is 10/6/1582 an invalid date? It doesn't exist in the Gregorian calendar, but there's a simpler reason here: strptime doesn't handle dates before 1900.
But also notice that 2/29/1900 gets turned into 3/1/1900. That's weird and we should fix that, because there are no leap years in years divisible by 100. Well, unless they are divisible by 400, which is why 2/29/2000 works.
Wed Jan 6 00:00:00 2021
Date <01/37/456> is not valid
Date <10/6/1582> is not valid
Tue Oct 18 00:00:00 1988
Thu Mar 1 00:00:00 1900
Thu Feb 29 00:00:00 1996
Tue Feb 29 00:00:00 2000
But let's fix that leap year issue. The tm struct is doing a dumb conversion. If the individual numbers are within a reasonable range (0 to 31 for days) regardless of the month, then it converts those days to seconds and adds them to the offset. That's why 2/29/1900 ends up a day later: that 29 gives the same number of seconds as 3/1/1900. If the date is valid, it should come back the same. And since I'm going to roundtrip this, I fix up the date for leading zeros before I do anything with it:
use v5.10;
use Time::Piece; # comes with Perl
my @dates = qw(
01/06/2021 2/29/1900 2/2/2020
);
foreach my $date ( @dates ) {
    state $format = '%m/%d/%Y';
    $date =~ s/\b(\d)\b/0$1/g; # add leading zeroes to lone digits
    my $t = eval { Time::Piece->strptime( $date, $format ) };
    unless( $t ) {
        say "Date <$date> is not valid";
        next;
    }
    unless( $t->strftime( $format ) eq $date ) {
        say "Round trip failed for <$date>: Got <"
            . $t->strftime( $format ) . ">";
        next;
    };
    say $t;
}
Now the output is:
Wed Jan 6 00:00:00 2021
Round trip failed for <02/29/1900>: Got <03/01/1900>
Sun Feb 2 00:00:00 2020
That's all a bit long, but that's why we have subroutines:
if( date_is_valid( $date ) ) { ... }
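A minimal sketch of what such a date_is_valid subroutine might look like, assuming the same Time::Piece round trip described above. The body is an assumption for illustration, not code from the original answer.

```perl
use strict;
use warnings;
use Time::Piece;    # comes with Perl

# Hypothetical helper: true if $date (mm/dd/yyyy) survives a parse/format round trip.
sub date_is_valid {
    my ($date) = @_;
    return 0 unless defined $date;

    my $format = '%m/%d/%Y';

    # Work on a copy and pad lone digits so the round-trip comparison is fair.
    ( my $padded = $date ) =~ s/\b(\d)\b/0$1/g;

    # strptime dies on input it cannot parse at all; eval turns that into undef.
    my $t = eval { Time::Piece->strptime( $padded, $format ) };
    return 0 unless defined $t;

    # A date that was silently rolled over will not format back to the input.
    return $t->strftime($format) eq $padded ? 1 : 0;
}

for my $date (qw(01/06/2021 2/2/2020 13/01/2021)) {
    printf "%-10s %s\n", $date, date_is_valid($date) ? 'valid' : 'not valid';
}
```

Copying the input before padding keeps the caller's string untouched, which the loop version above does not do.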
Still want a regex? Okay, let's use the (??{...}) construct to decide if a pattern should fail. Match a bunch of digits and capture that into $1. Now, use (??{...}) to make the next part of the pattern, using any Perl code you like. If you accept the capture, return a null pattern. If you reject it, return the pattern (*FAIL), which immediately causes the whole match to fail. No more tricky alternations. And this one uses the new chained comparison in v5.32 (although I still have misgivings about it):
use v5.32;
foreach ( qw(-1 0 1 37 31 5 ) ) {
if( /\A(\d+)(??{ (1 <= $1 <= 31) ? '' : '(*FAIL)' })\z/ ) {
say "Value <$1> is between 1 and 31";
}
}
Try it:
/(0?[1-9]|1[012])\/(0?[1-9]|[12][0-9]|3[01])\/((19|20)\d\d)/
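The parentheses are what make this pattern work: without them, each | splits the whole pattern rather than just the month, day, or year part. A small sketch of the difference follows; the ungrouped pattern is written in the style of the question's attempt, and the \A and \z anchors on the grouped one are an addition, since the pattern as posted is unanchored.

```perl
use strict;
use warnings;

# Ungrouped, in the style of the question's attempt: any single alternative
# (for example a lone [1-9]) is enough for a match.
my $ungrouped = qr/[1-9]|0[1-9]|1[0-2]\/\d{1,2}\/19|20\d\d/;

# Grouped pattern from this answer, with anchors added (an assumption).
my $grouped = qr/\A(0?[1-9]|1[012])\/(0?[1-9]|[12][0-9]|3[01])\/((19|20)\d\d)\z/;

for my $s (qw(99/99/9999 7/4/1999 12/31/2050)) {
    printf "%-12s ungrouped=%d grouped=%d\n", $s,
        ( $s =~ $ungrouped ? 1 : 0 ),
        ( $s =~ $grouped   ? 1 : 0 );
}
```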
Is a regular expression a must? If not, you are better off using a different approach, such as DateTime::Format::DateManip:
use DateTime::Format::DateManip;
my @dates = (
    '04/23/2009',
    '01/22/2010 6pm',
    'adasdas',
    '1010101/12312312/1232132'
);
for my $date ( @dates )
{
    # keep the parsed object separate from the input string
    my $dt = DateTime::Format::DateManip->parse_datetime( $date );
    die "Bad Date $date" unless defined $dt;
    print "$dt\n";
}
Regex for a 1-31 day:
(0[1-9]|[12]\d|3[01]) with prefix 0 - when "01", "04"
([1-9]|[12]\d|3[01]) without prefix 0 - when "1", "23"...
(0?[1-9]|[12]\d|3[01]) - with or without "0" - matches both forms
Simpler regex:
([12]\d|3[01]|0?[1-9])
Consider the accepted answer and this expression:
(0[1-9]|[12]\d|3[01])
This matches 01 but not 1
The other expression in the accepted answer:
([1-9]|[12]\d|3[01])
This matches 1 but not 01
It is not possible to add an OR clause to get them both working.
The one I suggested matches both. Hope this helps.
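One reason the ordering in ([12]\d|3[01]|0?[1-9]) can matter: Perl tries alternatives left to right, so in an unanchored match a single-digit branch placed first can win before a two-digit branch is tried. A small sketch, with hypothetical variable and helper names:

```perl
use strict;
use warnings;

my $single_first = qr/(0?[1-9]|[12]\d|3[01])/;    # single-digit branch first
my $double_first = qr/([12]\d|3[01]|0?[1-9])/;    # two-digit branches first

# Hypothetical helper: what the first capture group grabbed, or '' on no match.
sub first_capture {
    my ( $re, $s ) = @_;
    return $s =~ $re ? $1 : '';
}

for my $s (qw(23 31 01 5)) {
    printf "%-2s single-first='%s' double-first='%s'\n", $s,
        first_capture( $single_first, $s ),
        first_capture( $double_first, $s );
}
```

With ^ and $ anchors around either pattern the difference disappears, because the regex engine backtracks into the later alternatives until the whole string is consumed.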
I have been working with this for some time and the best regex I've come up with is the following:
\b(0)?(?(1)[1-9]|(3)?(?(2)[01]|[12][0-9]))\b|\b[1-9]\b
It will match the following numbers:
1 01 10 15 20 25 30 31
It does not match the following:
32 40 50 90
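A quick way to check those two lists, sketched in Perl. The whole_match helper and the added \A...\z wrapper are assumptions for testing whole strings; the pattern itself is copied unchanged.

```perl
use strict;
use warnings;

my $day_re = qr/\b(0)?(?(1)[1-9]|(3)?(?(2)[01]|[12][0-9]))\b|\b[1-9]\b/;

# Hypothetical helper: true only if the pattern accounts for the entire string.
sub whole_match {
    my ($s) = @_;
    return $s =~ /\A(?:$day_re)\z/ ? 1 : 0;
}

my @should_match = qw(1 01 10 15 20 25 30 31);
my @should_fail  = qw(32 40 50 90);

print "matched:  @{[ grep {  whole_match($_) } @should_match ]}\n";
print "rejected: @{[ grep { !whole_match($_) } @should_fail ]}\n";
```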