Can anyone recommend a method to perform the following string operation using C#?

Suppose I have a string:

"my event happened in New York on Broadway in 1976"

I have many such strings, but the locations and dates vary. For example:

"my event happened in Boston on 2nd Street in 1998"
"my event happened in Ann Arbor on Washtenaw in 1968"

So the general form is: "my event happened in X on Y in Z"

I would like to parse the string to extract X, Y and Z.

I could use Split with the sentinel words "in" and "on" to delimit the tokens I want, but this seems clunky. On the other hand, a full parser/lexer like Grammatica seems heavyweight.

Recommendations would be gratefully accepted.

Is there a "simple" parser/lexer for C#?


KISS applies here. Just do the String.Split solution, or use String.IndexOf to find the "in" and "on" (frankly, String.Split is the simplest). You don't need anything more complicated for such a simple "grammar"; in particular, regex is overkill here.
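As a rough illustration of the Split route, here is a minimal sketch. The `EventParser` class and `TryParse` method are names made up for this example, and it assumes the sentinel words always appear as " in " and " on " with a space on each side.

```csharp
using System;

public static class EventParser
{
    // Hypothetical helper: returns false when the text does not break into
    // exactly four pieces around the sentinel words.
    public static bool TryParse(string text, out string city, out string street, out string year)
    {
        city = street = year = "";

        // Pad the sentinels with spaces so that "in" inside a word such as
        // "Washington" is not treated as a delimiter.
        string[] parts = text.Split(new[] { " in ", " on " }, StringSplitOptions.None);
        if (parts.Length != 4) return false;

        // parts[0] is the "my event happened" prefix and is ignored.
        city = parts[1].Trim();
        street = parts[2].Trim();
        year = parts[3].Trim();
        return true;
    }
}
```

Note that this does not check the order of the sentinels, and a place name that itself contains " in " or " on " would produce extra pieces and make `TryParse` return false.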


Try using regex pattern matching. Here's an MSDN link that should be pretty helpful: http://support.microsoft.com/kb/308252


An example might help. Note that a regex solution gives you scope to accept more variants as and when you see them. I reject the idea that regex is overkill, by the way. I'm no expert, but it's so easy to do this kind of thing that I wonder why it's not used more often.

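// Needs: using System; using System.Text.RegularExpressions;

// One named group per piece of the sentence.
var regEx = new Regex(
        "(?<intro>.+) in (?<city>.+) on (?<locality>.+) in (?<eventDate>.+)"
        );

var match = regEx.Match("My event happens in Baltimore on Main Street in 1876.");

// Stop if the input does not have the expected shape.
if (!match.Success) return;

// Print each captured group by name.
foreach (var group in new[] {"intro", "city", "locality", "eventDate"})
{
    Console.WriteLine(group + ":" + match.Groups[group]);
}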

Finally, if performance is a real worry (though ignore this if it isn't), look here for optimisation tips.


If you are sure the string will always be in that format, you can do what you have already worked out: split on the word "in" and then on "on".

To be safe, you could then look up the extracted values in a database of city names, and check that the year is valid.


If the string may not always be in that format, you can instead search the whole string for words and match them against a database of city names and years to check that they are valid.
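To make the validation idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `EventValidator` and its methods are hypothetical names, the in-memory `HashSet` stands in for a real database of city names, and the year range is an arbitrary choice.

```csharp
using System;
using System.Collections.Generic;

public static class EventValidator
{
    // Hypothetical stand-in for a database table of city names.
    private static readonly HashSet<string> KnownCities =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "New York", "Boston", "Ann Arbor" };

    // Checks an already extracted value against the known cities.
    public static bool IsKnownCity(string city)
    {
        return KnownCities.Contains(city.Trim());
    }

    // Assumption: any whole number from 1000 up to the current year counts as
    // a plausible year. Surrounding spaces and full stops are ignored.
    public static bool IsPlausibleYear(string text)
    {
        int year;
        return int.TryParse(text.Trim(' ', '.'), out year)
            && year >= 1000
            && year <= DateTime.Now.Year;
    }

    // For free-form text: returns a known city found anywhere in the string,
    // or null when none is present.
    public static string FindKnownCity(string text)
    {
        foreach (string city in KnownCities)
        {
            if (text.IndexOf(city, StringComparison.OrdinalIgnoreCase) >= 0)
                return city;
        }
        return null;
    }
}
```

A simple substring scan like `FindKnownCity` can also match inside longer words, so a real lookup would probably want to match on word boundaries.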
