Find all substrings that can be interpreted as a time
Is there an efficient way to search a message for substrings which might represent a time?
For example, this message:
let's meet tomorrow at 14:30 or do you prefer 2:30pm?
should return ('14:30', '2:30pm'). Finding hh:mm times is easy with a simple regex, but I'm wondering whether there are existing solutions that handle more than the simple cases.
Here's a regex I came up with:
^((\d{1,2}:\d{2}\s?([ap]m?)?)|(\d{1,2}\s?[ap]m?))$
It matches:
2:10
14:20
10:00am
3:49p
4pm
10a
But not:
12
22:342
14:0
20rpm
As seen on Rubular.
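The two lists above can also be checked mechanically. Here is a minimal Python sketch, assuming Python's `re` module treats this pattern the same way Rubular (Ruby) does for these single-line inputs. The names `TIME_RE`, `should_match` and `should_not_match` are made up for illustration.

```python
import re

# The regex from above, unchanged. ^ and $ tie it to the whole string.
TIME_RE = re.compile(r"^((\d{1,2}:\d{2}\s?([ap]m?)?)|(\d{1,2}\s?[ap]m?))$")

should_match = ["2:10", "14:20", "10:00am", "3:49p", "4pm", "10a"]
should_not_match = ["12", "22:342", "14:0", "20rpm"]

# Each assert names the offending string if the regex disagrees with the lists.
for s in should_match:
    assert TIME_RE.match(s), s
for s in should_not_match:
    assert not TIME_RE.match(s), s
```

The script finishes silently when the regex agrees with both lists.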
I think it would be too difficult to make it much smarter than this. For example, given "I have 2 classes after 2 tomorrow", you can't expect a program to work out which number is a time unless it understands semantics, and that's a whole different story.
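Note that the regex above is anchored with ^ and $, so it only tests a whole string. To pull candidates out of a longer message, one option is to swap the anchors for word boundaries. This is a sketch under that assumption, not part of the original regex. `TIME_IN_TEXT` and `find_times` are hypothetical names.

```python
import re

# Same alternatives as the regex above, but with \b instead of ^...$.
# Non-capturing groups make findall return the whole match, and the
# optional space is only consumed when an a/p suffix follows it.
TIME_IN_TEXT = re.compile(
    r"\b(?:\d{1,2}:\d{2}(?:\s?[ap]m?)?|\d{1,2}\s?[ap]m?)\b"
)

def find_times(message):
    """Return every substring of message that looks like a time."""
    return TIME_IN_TEXT.findall(message)

print(find_times("let's meet tomorrow at 14:30 or do you prefer 2:30pm?"))
# ['14:30', '2:30pm']
print(find_times("I have 2 classes after 2 tomorrow"))
# []
```

It still has no notion of meaning: "2 a day" yields "2 a", because a lone "a" after a number looks like an am marker.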
PS: The regex also matches strings like 99:99 am. That can be fixed, but it would make the regex even more confusing, so it's not worth fixing IMO.
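One way to reject values like 99:99 am without complicating the regex is to keep the pattern loose and range-check the captured numbers in code. Here is a sketch, assuming a 12-hour clock when an a/p suffix is present and a 24-hour clock otherwise. `LOOSE_TIME` and `is_plausible_time` are hypothetical names.

```python
import re

# Loose pattern with named groups; the range checks happen in Python.
LOOSE_TIME = re.compile(
    r"^(?P<hour>\d{1,2})(?::(?P<minute>\d{2}))?\s?(?P<suffix>[ap]m?)?$"
)

def is_plausible_time(text):
    m = LOOSE_TIME.match(text)
    if m is None:
        return False
    # A bare number such as "12" is rejected, as in the original regex.
    if m.group("minute") is None and m.group("suffix") is None:
        return False
    hour = int(m.group("hour"))
    minute = int(m.group("minute") or 0)
    if m.group("suffix"):
        hour_ok = 1 <= hour <= 12   # assumed 12-hour clock with am/pm
    else:
        hour_ok = 0 <= hour <= 23   # assumed 24-hour clock without it
    return hour_ok and 0 <= minute <= 59
```

With this, `is_plausible_time("99:99 am")` is false, while `is_plausible_time("14:20")` and `is_plausible_time("4pm")` are true.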