Need help with a Regex for parsing human-typed times
I'm really new to Regex and working hard, but this has gone beyond simple in my opinion. I understand how to create the Regex object in .Net but I'm not sure how to use it for my specific purpose once I have a pattern.
Regex regex = new Regex("(at ){0,1}[0-9]{1,2}(:[0-9]{2}){0,1}(?:[ap]m?){0,1}");
I need to be able to take a sentence like "Dinner will be at 9pm at your favorite restaurant" and get the values { "Dinner will be at your favorite restaurant", "9pm " } (removing "at " if it exists).
Complete(?) test cases:
"Dinner at 9pm" { "Dinner", "9pm" }
"Dinner at9pm" { "Dinner", "9pm" }
"Dinner 9pm" { "Dinner", "9pm" }
"Dinner 9p" { "Dinner", "9pm" }
"Dinner 9a" { "Dinner", "9am" }
"Dinner 9pZ" { "Dinner 9pZ", "" }
"Dinner 9aZ" { "Dinner 9aZ", "" }
"Dinner at 9" { "Dinner", "9" }
"Dinner at 9:15pm" { "Dinner", "9:15pm" }
"Dinner at 9:15" { "Dinner", "9:15" }
"Dinner at9:15" { "Dinner", "9:15" }
"Dinner at 9pm in Seattle" { "Dinner in Seattle", "9pm" }
"Dinner at9pmin Seattle" { "Dinner in Seattle", "9pm" }
"Dinner at9in Seattle" { "Dinner in Seattle", "9" }
"Dinner 9in Seattle" { "Dinner 9in Seattle", "" }
"9pm Dinner" { "Dinner", "9pm" }
"The 9pm Dinner was good" { "The Dinner was good", "9pm" }
"Dinner at 9pmpm" { "Dinner pm", "9pm" }
"Dinner at 9:15pmpm" { "Dinner pm", "9:15pm" }
(just for further clarification, a number without a ":" or "am/pm" must be preceded by "at" unless it is the first number listed. "am" and "pm" require either an ending in "M" or " ".)
Beyond the test cases, I don't understand the syntax needed to get back the values I need using the regex object (list in the brackets above).
A regex for doing this would be complicated and it also wouldn't return the results in the expected order in cases such as "9pm Dinner". If you're willing to spend a little time, it might be simpler to write a basic recursive-descent parser. Each word in the input would form a token, and you can easily come up with rules based on your requirements. For example:
event: "Dinner" time |
"Dinner" location |
"Dinner" time location |
"Dinner" location time
time: "at" number ":" number "am"/"pm"
/* etc. */
You then write a small function for each non-terminal (event, time, location etc.) that will do its part and return the result.
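As a rough illustration of that idea, here is a minimal C# sketch with one small function per rule. The class name `EventParser` and its helper methods are hypothetical, and the grammar is deliberately simplified: tokens are whitespace-separated words, so glued forms such as "at9pm" are not recognized, and a bare number such as "9" is accepted as a time even without a preceding "at". Treat it as a starting point under those assumptions, not a solution to every test case in the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: a tiny hand-written parser, one method per grammar rule.
public sealed class EventParser
{
    private readonly string[] _tokens;
    private int _pos;

    public EventParser(string input)
    {
        // Assumption: each whitespace-separated word is one token.
        _tokens = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // event: (word | time)*
    // The first time found is kept; every other word becomes part of the text.
    public (string Text, string Time) ParseEvent()
    {
        var words = new List<string>();
        string time = "";
        while (_pos < _tokens.Length)
        {
            if (time.Length == 0 && TryParseTime(out string t))
                time = t;
            else
                words.Add(_tokens[_pos++]);
        }
        return (string.Join(" ", words), time);
    }

    // time: ["at"] timeToken
    private bool TryParseTime(out string time)
    {
        int start = _pos;
        if (string.Equals(_tokens[_pos], "at", StringComparison.OrdinalIgnoreCase))
            _pos++;
        if (_pos < _tokens.Length && IsTimeToken(_tokens[_pos]))
        {
            time = _tokens[_pos++];
            return true;
        }
        _pos = start; // backtrack so "at" is treated as an ordinary word
        time = "";
        return false;
    }

    // timeToken: number [":" number] ["am" | "pm"], written as one word
    private static bool IsTimeToken(string token)
    {
        string s = token.ToLowerInvariant();
        if (s.EndsWith("am", StringComparison.Ordinal) || s.EndsWith("pm", StringComparison.Ordinal))
            s = s.Substring(0, s.Length - 2);
        string[] parts = s.Split(':');
        if (parts.Length > 2) return false;
        if (!IsNumber(parts[0], 1, 2)) return false;
        return parts.Length == 1 || IsNumber(parts[1], 2, 2);
    }

    private static bool IsNumber(string s, int minLen, int maxLen)
    {
        if (s.Length < minLen || s.Length > maxLen) return false;
        foreach (char c in s)
            if (!char.IsDigit(c)) return false;
        return true;
    }
}
```

With this sketch, `new EventParser("Dinner at 9pm in Seattle").ParseEvent()` yields the text "Dinner in Seattle" and the time "9pm". Handling the glued cases would need a finer tokenizer that splits words into letter and digit runs.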
As you can see, your requirements already raise so many possibilities that a regex, if one is possible at all, would only make things extremely confusing.
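On the narrower question of how to get values back out of a .NET `Regex` object once you have a pattern: `Regex.Match` returns a `Match`, named groups are read through `Match.Groups`, and `Match.Index` and `Match.Length` tell you which part of the input to cut out. The sketch below assumes a simplified pattern of my own (not the one from the question) and a hypothetical helper class `TimeExtractor`. It covers the straightforward cases only: it does not reject inputs like "Dinner 9pZ" and does not normalize "9p" to "9pm".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper showing how to read results from a Regex match.
public static class TimeExtractor
{
    // Assumed pattern: optional "at", then a 1-2 digit hour,
    // optional ":mm", and an optional a/p/am/pm suffix.
    private static readonly Regex TimePattern = new Regex(
        @"(?:\bat\s*)?(?<time>\d{1,2}(?::\d{2})?(?:[ap]m?)?)",
        RegexOptions.IgnoreCase);

    public static (string Text, string Time) Extract(string input)
    {
        Match m = TimePattern.Match(input);
        if (!m.Success)
            return (input, "");

        // The named group holds just the time, without the leading "at".
        string time = m.Groups["time"].Value;

        // Cut the whole match (including "at") out of the sentence,
        // then collapse the double space that removal can leave behind.
        string rest = input.Remove(m.Index, m.Length);
        rest = Regex.Replace(rest, @"\s{2,}", " ").Trim();
        return (rest, time);
    }
}
```

Calling `TimeExtractor.Extract("Dinner will be at 9pm at your favorite restaurant")` gives the text "Dinner will be at your favorite restaurant" and the time "9pm". Only the first match is used, which is why the second "at" stays in the sentence.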