In too deep --#-- Constructing a regex that accepts any valid date, while rejecting invalid ones like 2/29/1999, is very tough indeed. It's so tough that when I needed to do it, I nearly threw up my hands. Then I went looking for a good date regex.

The closest I came was a partial formula in Friedl's book, where he teases you with possible solutions to matching the day values in a 31-day month.

The key is reducing the problem to alternations among every possible way to express a date. There aren't that many, really!

At any rate, after about 10 hours of noodling, I came up with the following regex for validating a date.

$date =~ m%^(0?[1-9]|1[0-2])/([12][0-9]|3[01]|0?[1-9])/(19|20)(\d\d)$%;
The only problem is that it accepts any day up to 31 in every month, so it allows dates like 2/29/1999. I actually implemented this in a Java servlet using a library called Perl Tools. But when someone wanted to require that the date be absolutely valid, I supplemented it by instantiating a Java Calendar and checking whether an exception was thrown.
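
For anyone who wants to try the same two-step idea in plain Perl, here is a minimal sketch. It is not the servlet code: it swaps the Java Calendar for the core Time::Local module, whose timegm croaks when the day does not exist in the given month and year. The helper name is_real_date is made up for illustration, and the year is captured as one group instead of two.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Shape check only: month 1-12, day 1-31, year 1900-2099.
my $shape = qr{^(0?[1-9]|1[0-2])/([12][0-9]|3[01]|0?[1-9])/((?:19|20)\d\d)$};

# is_real_date is a hypothetical helper name, not part of the original servlet.
sub is_real_date {
    my ($date) = @_;
    my ($month, $day, $year) = $date =~ $shape
        or return 0;
    # timegm croaks if the day is out of range for that month and year,
    # so a failed eval means the date does not exist.
    return eval { timegm(0, 0, 0, $day, $month - 1, $year); 1 } ? 1 : 0;
}

for my $date ('2/29/1999', '2/29/2000', '4/31/2001', '12/31/2020') {
    printf "%-10s %s\n", $date, is_real_date($date) ? 'OK' : 'not OK';
}
```

Here the regex only decides whether the string looks like a date, and the calendar check decides whether that date exists, so 2/29/1999 and 4/31/2001 should come back as not OK.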

I know, I know. Why not just skip the regex? That's a perfectly reasonable approach. But by this point I was in too deep. Another 10 hours later, I'd noodled the solution. (The regex itself should all be one line, but it wouldn't fit in this format. I've marked the line continuations by //.)

$year = substr($date, length($date) -2, 2);
if (($year % 4) != 0) {
    # Two-digit year not divisible by 4: February runs 1-28.
    if ($date =~ m%(^(0?[13578]|1[02])/([12][0-9]|3[01]|0?[1-9])/(19|20)(\d\d)$) //
        |(^(0?[469]|11)/([12][0-9]|30|0?[1-9])/(19|20)(\d\d)$) //
        |(^(0?2)/(1[0-9]|2[0-8]|0?[1-9])/(19|20)(\d\d)$)%) {
        print "OK!\n";
    } else {
        print "not OK\n";
    }
} else {
    # Two-digit year divisible by 4: February runs 1-29.
    if ($date =~ m%(^(0?[13578]|1[02])/([12][0-9]|3[01]|0?[1-9])/(19|20)(\d\d)$) //
        |(^(0?[469]|11)/([12][0-9]|30|0?[1-9])/(19|20)(\d\d)$) //
        |(^(0?2)/(1[0-9]|2[0-9]|0?[1-9])/(19|20)(\d\d)$)%) {
        print "OK!\n";
    } else {
        print "not OK\n";
    }
}
The artificial line breaks also show the basic logic here: 31-day months (1, 3, 5, 7, 8, 10 or 12), 30-day months (4, 6, 9 or 11), and February.
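
If you would rather have something you can paste and run, the same three-way split can be written with Perl's /x modifier, which lets the pattern span real lines and carry comments. This is only a sketch under a few assumptions: check_date is a hypothetical name, the two February ranges are chosen up front instead of repeating the whole regex, and the leap-year test is still the two-digit arithmetic used above.

```perl
use strict;
use warnings;

# check_date is a hypothetical name; the logic follows the three-branch regex above.
sub check_date {
    my ($date) = @_;
    # Last two digits of the year, as in the original substr() call.
    my ($year) = $date =~ /(\d\d)$/
        or return 0;
    # February gets 29 days when the two-digit year is divisible by 4.
    my $feb_days = ($year % 4 == 0)
        ? qr/1[0-9]|2[0-9]|0?[1-9]/
        : qr/1[0-9]|2[0-8]|0?[1-9]/;
    return $date =~ m{
        ^ (?:
            (?: 0?[13578] | 1[02] ) / (?: [12][0-9] | 3[01] | 0?[1-9] )   # 31-day months
          | (?: 0?[469]   | 11    ) / (?: [12][0-9] | 30    | 0?[1-9] )   # 30-day months
          | 0?2                     / (?: $feb_days )                     # February
        )
        / (?: 19 | 20 ) \d\d $
    }x ? 1 : 0;
}

for my $date ('2/29/1999', '2/29/2004', '4/31/2001', '12/31/1999') {
    print "$date: ", (check_date($date) ? "OK!" : "not OK"), "\n";
}
```

Anchoring once around the whole alternation, instead of inside each branch, is a design choice of this sketch and should accept the same strings as the one-line version.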

Sure, it's not purely a regex. It uses arithmetic to check for leap year. So what? Now, that leap year check could be implemented as a regex. Why don't you show me? Oh, and by the way, it is inaccurate on one date between 1/1/1900 and 12/31/2099. Can you guess which one?
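
One possible answer to the regex challenge, offered as a sketch and not as the definitive solution: a two-digit number is divisible by 4 exactly when an even tens digit is followed by 0, 4 or 8, or an odd tens digit is followed by 2 or 6. The name $leap_yy is made up here, and the pattern reproduces the two-digit arithmetic check as it stands, including its one inaccurate date.

```perl
use strict;
use warnings;

# Matches dates whose last two digits form a number divisible by 4:
# an even tens digit pairs with 0, 4 or 8; an odd tens digit with 2 or 6.
my $leap_yy = qr/(?:[02468][048]|[13579][26])$/;

for my $date ('2/29/1996', '2/29/1999', '2/29/2004') {
    my $by_regex = $date =~ $leap_yy ? 1 : 0;
    my $by_arith = substr($date, -2) % 4 == 0 ? 1 : 0;
    print "$date regex=$by_regex arithmetic=$by_arith\n";
}
```

The two columns should agree for every date, which is all this sketch sets out to show.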