Making part of the regex optional
Here is my regex:
/On.* \d{1,2}\/\d{1,2}\/\d{1,4} \d{1,2}:\d{1,2} (?:AM|PM),.*wrote:/
to match:
On 3/14/11 2:55 PM, XXXXX XXXXXX wrote:
I need this Regex to also match:
On 25/03/2011, at 2:19 AM, XXXXX XXXXXXXX wrote:
So I tried this:
/On.* \d{1,2}\/\d{1,2}\/\d{1,4}(, at)? \d{1,2}:\d{1,2} (?:AM|PM),.*wrote:/
But that breaks the other matches.
Am I writing the optional group (, at)? correctly?
Thanks
I changed your regex only slightly, making the optional group non-capturing, and I am able to match both strings. The regex I have is:
/On.* \d{1,2}\/\d{1,2}\/\d{1,4}(?:, at)? \d{1,2}:\d{1,2} (?:AM|PM),.*wrote:/
Comparing the results of the two:
irb(main):023:0> s1 = "On 25/03/2011, at 2:19 AM, XXXXX XXXXXXXX wrote:"
=> "On 25/03/2011, at 2:19 AM, XXXXX XXXXXXXX wrote:"
irb(main):024:0> s2 = "On 3/14/11 2:55 PM, XXXXX XXXXXX wrote:"
=> "On 3/14/11 2:55 PM, XXXXX XXXXXX wrote:"
# The previous regex as run in this session, with no ? after the (?:, at) group
irb(main):025:0> m = /On.* \d{1,2}\/\d{1,2}\/\d{1,4}(?:, at) \d{1,2}:\d{1,2} (?:AM|PM),.*wrote:/
=> /On.* \d{1,2}\/\d{1,2}\/\d{1,4}(?:, at) \d{1,2}:\d{1,2} (?:AM|PM),.*wrote:/
irb(main):026:0> s1.match(m)
=> #<MatchData "On 25/03/2011, at 2:19 AM, XXXXX XXXXXXXX wrote:">
irb(main):027:0> s2.match(m)
=> nil
#The updated Regex
irb(main):028:0> m = /On.* \d{1,2}\/\d{1,2}\/\d{1,4}(?:, at)? \d{1,2}:\d{1,2} (?:AM|PM),.*wrote/
=> /On.* \d{1,2}\/\d{1,2}\/\d{1,4}(?:, at)? \d{1,2}:\d{1,2} (?:AM|PM),.*wrote/
irb(main):029:0> s1.match(m)
=> #<MatchData "On 25/03/2011, at 2:19 AM, XXXXX XXXXXXXX wrote">
irb(main):030:0> s2.match(m)
=> #<MatchData "On 3/14/11 2:55 PM, XXXXX XXXXXX wrote">
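For what it's worth, the only difference between (, at)? and (?:, at)? is whether the group is captured. Here is a small sketch (the variable names are my own) that runs both forms against the two sample lines and prints the captures:

```ruby
s1 = "On 25/03/2011, at 2:19 AM, XXXXX XXXXXXXX wrote:"
s2 = "On 3/14/11 2:55 PM, XXXXX XXXXXX wrote:"

# Optional capturing group, as written in the question.
capturing     = /On.* \d{1,2}\/\d{1,2}\/\d{1,4}(, at)? \d{1,2}:\d{1,2} (?:AM|PM),.*wrote:/
# Optional non-capturing group, as written in this answer.
non_capturing = /On.* \d{1,2}\/\d{1,2}\/\d{1,4}(?:, at)? \d{1,2}:\d{1,2} (?:AM|PM),.*wrote:/

[s1, s2].each do |s|
  p capturing.match(s).captures      # [", at"] for s1, [nil] for s2
  p non_capturing.match(s).captures  # [] for both
end
```

Both forms match both sample lines, so if other inputs stop matching, the cause is probably somewhere else in the pattern.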
The following regex works for both cases:
On\s*\d{1,2}\/\d{1,2}\/\d{1,4}[\s,]*(at)?\s*\d{1,2}:\d{1,2}\s*(?:AM|PM),\s*.*wrote:
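A quick way to check it in Ruby, as a sketch (the names flexible and samples are just for illustration):

```ruby
flexible = /On\s*\d{1,2}\/\d{1,2}\/\d{1,4}[\s,]*(at)?\s*\d{1,2}:\d{1,2}\s*(?:AM|PM),\s*.*wrote:/

samples = [
  "On 3/14/11 2:55 PM, XXXXX XXXXXX wrote:",
  "On 25/03/2011, at 2:19 AM, XXXXX XXXXXXXX wrote:",
]

# Print each sample together with whether the pattern matches it.
samples.each do |line|
  puts "#{line.inspect} => #{flexible.match?(line)}"
end
```

Keep in mind that \s* also accepts zero whitespace, so this pattern is more permissive than the original one with literal spaces.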
The problem with other input strings may be caused by the .* idiom. It is greedy and wants to consume as much of the input as it can.
If your input is, for example, a date, followed by some random text, and then another date, your regex will treat the two dates and the text between them as one single match. Most of it will be consumed by .*.
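To illustrate, here is a sketch with a made-up input line (the names Alice and Bob and the text in between are hypothetical) that contains two headers:

```ruby
# Hypothetical input: two reply headers on one line with other text between them.
text = "On 3/14/11 2:55 PM, Alice wrote: thanks! On 25/03/2011, at 2:19 AM, Bob wrote:"

greedy = /On.* \d{1,2}\/\d{1,2}\/\d{1,4}(?:, at)? \d{1,2}:\d{1,2} (?:AM|PM),.*wrote:/

m = greedy.match(text)
puts m[0]          # the whole line, not just the first header
puts m[0] == text  # true
```

The first .* runs from the first "On" all the way to the last date, so both headers end up inside one match.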
In most cases it's better to use a lazy quantifier: write .*? instead of .*. You have two .* in your regex. Try replacing both with .*?:
/On.*? \d{1,2}\/\d{1,2}\/\d{1,4}(, at)? \d{1,2}:\d{1,2} (?:AM|PM),.*?wrote:/
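As a sketch of the effect, assuming a made-up line with two headers (Alice and Bob are hypothetical names), the lazy version finds each header separately. I use a non-capturing (?:, at)? here so that String#scan returns the matched strings rather than the captures:

```ruby
# Hypothetical input: two reply headers on one line with other text between them.
text = "On 3/14/11 2:55 PM, Alice wrote: thanks! On 25/03/2011, at 2:19 AM, Bob wrote:"

lazy = /On.*? \d{1,2}\/\d{1,2}\/\d{1,4}(?:, at)? \d{1,2}:\d{1,2} (?:AM|PM),.*?wrote:/

headers = text.scan(lazy)
p headers
# ["On 3/14/11 2:55 PM, Alice wrote:", "On 25/03/2011, at 2:19 AM, Bob wrote:"]
```

A lazy quantifier still expands as far as it needs to, so a stray "On" earlier on the same line would still be pulled into the match.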
If that doesn't work, you'll have to post the failing dates here and you will most certainly get more feedback from this community.