Keyword proximity matching - options?
I have a case where I have an array of keywords. I want to find their matches within a given string and return x number of words before and after each.
I could write a loop that goes through each keyword, finds the index of every occurrence in the string, and builds the result from concatenated substrings around those indexes, but this seems a bit lengthy.
I've heard of Lucene, but I'm not sure that pulling in an entire framework is worth it for this. Also, if it is possible, how can I accomplish this with Lucene?
Thanks.
Perhaps regular expressions would help. This builds a list of matching strings, each of the form: (up to 3 words before) keyword (up to 3 words after).
Edit: I missed a couple 0s and some @s. Try again.
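// Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
private static List<string> GetMatches(string s)
{
    string[] keywords = { "if", "while", "do" };
    int x = 3; // words before and after

    // Escape each keyword so any regex metacharacters in it are matched literally.
    string alternation = string.Join("|", keywords.Select(k => Regex.Escape(k)));

    // Up to x words before, the keyword as a whole word, up to x words after.
    // The trailing group is (\W+\w+) rather than \W+(\w+\W+) so that a keyword
    // or context word at the very end of the string is still matched.
    string ex =
        @"(?:\w+\W+){0," + x + @"}\b(?:" + alternation + @")\b(?:\W+\w+){0," + x + @"}";
    Regex regex = new Regex(ex);

    List<string> matches = new List<string>();
    foreach (Match match in regex.Matches(s))
    {
        matches.Add(match.Value);
    }
    return matches; // the original built the list but never returned it
}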
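
For comparison, the loop-based approach mentioned in the question does not have to be long if the text is split into words first. The following is only a sketch under a few assumptions: the class name `KeywordContext`, the method name `FindWithContext`, and the `window` parameter are made up for illustration, matching is case-sensitive like the regex above, and punctuation is dropped because words are extracted with `\w+` and rejoined with single spaces. One difference worth noting: `Regex.Matches` returns non-overlapping matches, so two keywords that sit within a few words of each other end up inside a single match, whereas this version returns one snippet per keyword occurrence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordContext
{
    // Returns one snippet per keyword occurrence: the keyword plus up to
    // `window` words on each side (fewer near the start or end of the text).
    public static List<string> FindWithContext(string text, IEnumerable<string> keywords, int window)
    {
        // Default string comparer: ordinal, case-sensitive lookup.
        var wanted = new HashSet<string>(keywords);

        // Tokenize into runs of word characters; punctuation is discarded.
        var words = Regex.Matches(text, @"\w+").Cast<Match>().Select(m => m.Value).ToList();

        var snippets = new List<string>();
        for (int i = 0; i < words.Count; i++)
        {
            if (!wanted.Contains(words[i])) continue;

            // Clamp the window to the bounds of the word list.
            int start = Math.Max(0, i - window);
            int end = Math.Min(words.Count - 1, i + window);
            snippets.Add(string.Join(" ", words.Skip(start).Take(end - start + 1)));
        }
        return snippets;
    }
}
```

Calling `KeywordContext.FindWithContext("do it while you can", new[] { "if", "while", "do" }, 2)` gives two snippets, one centered on `do` and one on `while`, even though their contexts overlap.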