find all occurrences of 'can be interpreted as time'

Is there an efficient way to search a message for substrings which might represent a time?

For example, this message:

let's meet tomorrow at 14:30 or do you prefer 2:30pm?

should return ('14:30', '2:30pm'). Finding hh:mm times can easily be achieved with a simple regex, but I'm wondering if there are existing solutions that find more than the simple cases.


Here's a regex I came up with:

^((\d{1,2}:\d{2}\s?([ap]m?)?)|(\d{1,2}\s?[ap]m?))$

It matches:

2:10
14:20
10:00am
3:49p
4pm
10a

But not:

12
22:342
14:0
20rpm

As seen on rubular
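For anyone who wants to check the lists above without rubular, here is a minimal sketch in Python (my choice of language, the regex itself is unchanged). The helper name `is_time` is hypothetical, just for illustration.

```python
import re

# The regex from above, unchanged. It is anchored with ^ and $, so it tests
# a whole string rather than searching inside a longer message.
TIME_RE = re.compile(r"^((\d{1,2}:\d{2}\s?([ap]m?)?)|(\d{1,2}\s?[ap]m?))$")


def is_time(candidate):
    """Return True if the whole string looks like a time (hypothetical helper)."""
    return TIME_RE.match(candidate) is not None


should_match = ["2:10", "14:20", "10:00am", "3:49p", "4pm", "10a"]
should_not_match = ["12", "22:342", "14:0", "20rpm"]

for s in should_match:
    assert is_time(s), s
for s in should_not_match:
    assert not is_time(s), s
```

Because of the anchors, this only validates an isolated candidate string; it will not find times embedded in a sentence.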

I think it would be too difficult to make it much smarter than this. For example, given "I have 2 classes after 2 tomorrow", you can't expect a program to correctly identify which numbers can be interpreted as a time unless it is able to understand semantics, but that's a whole different story.

PS: The regex also matches strings like 99:99 am. That can be fixed, but it would make the regex even more confusing, and IMO it's just not worth fixing.
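If you do want to search inside a message and reject things like 99:99 am, one option is to keep the regex loose and do the range checks in code. This is only a rough sketch under my own assumptions: the pattern is a reworked, unanchored variant of the one above (not the same regex), the name `find_times` is made up, and I assume a 12-hour clock when an am/pm suffix is present and a 24-hour clock otherwise.

```python
import re

# Unanchored variant: the lookarounds stop the match from starting or ending
# in the middle of a word or a longer number (so "20rpm" and "22:342" are skipped).
CANDIDATE_RE = re.compile(
    r"(?<![\w:])(\d{1,2})(?::(\d{2}))?(?:\s?([ap]m?))?(?![\w:])",
    re.IGNORECASE,
)


def find_times(message):
    """Return substrings of message that plausibly represent a time."""
    found = []
    for m in CANDIDATE_RE.finditer(message):
        hour, minute, suffix = m.group(1), m.group(2), m.group(3)
        if minute is None and suffix is None:
            continue  # a bare number such as "2" is too ambiguous
        if minute is not None and int(minute) > 59:
            continue  # rejects 99:99
        if suffix is not None:
            if not 1 <= int(hour) <= 12:
                continue  # assumed 12-hour clock with am/pm
        elif int(hour) > 23:
            continue  # assumed 24-hour clock without a suffix
        found.append(m.group(0))
    return found


print(find_times("let's meet tomorrow at 14:30 or do you prefer 2:30pm?"))
# ['14:30', '2:30pm']
```

It still can't handle semantics: "I have 2 classes after 2 tomorrow" returns nothing, and something like "at 2 a" followed by a space would be a false positive because a lone "a" is accepted as a suffix.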
