Regular expression help for a date (Python)

I'm trying to create an expression that matches 11.11.11 but not 111.11.111. I'm using Python:

keyword = re.compile(r"[0-9]*[0-9]\.[0-9]*[0-9]\.[0-9]*[0-9]")

The date could be at the start or end of a sentence, and it may have a newline before or after it rather than white space. How would I account for both? As it is, this picks up 11.11.11 but also 111.11.11111, etc. :(


* means "zero or more of the preceding token". Therefore your regex will match anything from 1.1.1 to 999999.999999.99999 etc.
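As a quick check of that claim, here is a minimal sketch that runs the pattern from the question against a few made-up sample strings. The sample values are only for illustration.

```python
import re

# The pattern from the question: "*" lets each group grow without limit.
loose = re.compile(r"[0-9]*[0-9]\.[0-9]*[0-9]\.[0-9]*[0-9]")

samples = ["1.1.1", "11.11.11", "111.11.11111"]

# fullmatch() requires the whole string to match the pattern.
matched = [s for s in samples if loose.fullmatch(s)]
print(matched)  # ['1.1.1', '11.11.11', '111.11.11111']
```

Every sample matches, including the one that should have been rejected.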

You can be more specific like this:

keyword = re.compile(r"\b[0-9]{2}\.[0-9]{2}\.[0-9]{2}\b")

The \b word boundary anchors make sure that the numbers start/end at that position. Otherwise you could pick up substring matches (matching 34.56.78 in the string 1234.56.7890, for example).
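To see the difference the anchors make, here is a small sketch comparing the pattern with and without \b. The names `unanchored` and `anchored` and the sample text are made up for the example. It also shows that \b is satisfied at the start or end of the string and next to a newline, which covers the case in the question.

```python
import re

unanchored = re.compile(r"[0-9]{2}\.[0-9]{2}\.[0-9]{2}")
anchored = re.compile(r"\b[0-9]{2}\.[0-9]{2}\.[0-9]{2}\b")

text = "1234.56.7890"
print(unanchored.findall(text))  # ['34.56.78']  (substring match)
print(anchored.findall(text))    # []

# Dates at the very start/end of the text, separated only by newlines.
multiline = "11.11.11\nwas the day\n12.12.12"
print(anchored.findall(multiline))  # ['11.11.11', '12.12.12']
```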

Of course, you'll need to validate whether it's actually a plausible date separately. Don't use regexes for this (it's possible but cumbersome), rather use the datetime module's strptime() classmethod.
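A minimal sketch of that two-step approach might look like the following. The helper name `find_dates` is hypothetical, and the format string "%d.%m.%y" (day.month.two-digit year) is an assumption, since the question doesn't say which order the fields are in. Adjust it to whatever your data uses.

```python
import re
from datetime import datetime

DATE_RE = re.compile(r"\b[0-9]{2}\.[0-9]{2}\.[0-9]{2}\b")

def find_dates(text, fmt="%d.%m.%y"):
    """Return date objects for regex candidates that parse under fmt.

    fmt is an assumed layout (day.month.year); change it as needed.
    """
    dates = []
    for candidate in DATE_RE.findall(text):
        try:
            dates.append(datetime.strptime(candidate, fmt).date())
        except ValueError:
            # Shaped like a date but not a real one, e.g. 99.99.99
            pass
    return dates

print(find_dates("Due 11.11.11, not 99.99.99"))
# [datetime.date(2011, 11, 11)]
```

The regex only finds candidates with the right shape; strptime() decides whether each one is a plausible date.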


You can use \b to match a word boundary. For example, you could make your regular expression:

 re.compile(r'\b\d{2}\.\d{2}\.\d{2}\b')

I've also used \d to match any digit and the {2} suffix to match two instances of whatever came previously. If you want to match either 1 or 2 digits in any of those cases, you could change the {2} to {1,2}.
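For instance, a quick sketch of the {1,2} variant (the name `flexible` and the sample strings are made up for the example). One caveat: on Python 3 str patterns, \d also matches non-ASCII Unicode digits unless you pass re.ASCII, so the sketch passes that flag to stay equivalent to [0-9].

```python
import re

# re.ASCII restricts \d to 0-9, like the [0-9] class.
flexible = re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{1,2}\b", re.ASCII)

for s in ["1.1.1", "1.11.11", "11.11.11", "111.11.111"]:
    print(s, bool(flexible.search(s)))
# 1.1.1 True
# 1.11.11 True
# 11.11.11 True
# 111.11.111 False
```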


Try using ? instead of * as the quantifier.

The ? matches 0 or 1 instances of the previous element. In other words, it makes the element optional; it can be present, but it doesn't have to be.

This will match both 1.1.1 and 11.11.11, but not 1111.1111.1111:

keyword = re.compile(r"\b[0-9]?[0-9]\.[0-9]?[0-9]\.[0-9]?[0-9]\b")
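Here is a short sketch of that pattern run over some made-up text, with one date at the very start followed by a newline:

```python
import re

keyword = re.compile(r"\b[0-9]?[0-9]\.[0-9]?[0-9]\.[0-9]?[0-9]\b")

text = "1.1.1\nthen 11.11.11 and 1111.1111.1111"
found = keyword.findall(text)
print(found)  # ['1.1.1', '11.11.11']
```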