Keyword proximity matching - options?

I have an array of keywords. I want to find their matches within a given string and return x words before and after each match.

I could write a loop that walks the keyword array, finds the index of each match, and builds the result by concatenating substrings around that index, but this seems a bit lengthy.

I've heard of Lucene, but I'm not sure that pulling in an entire framework for this is worth it. Also, if it is possible, how would I accomplish this with Lucene?

Thanks.


Perhaps regular expressions would help. This builds a list of matching strings, each of the form: (up to 3 words before) keyword (up to 3 words after).

Edit: I missed a couple 0s and some @s. Try again.

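// Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
private static List<string> GetMatches (string s)
{
   string[] keywords = {"if", "while", "do"};
   int x = 3; // max words of context before and after each keyword
   // Escape the keywords so any regex metacharacters in them match literally.
   string alternation = string.Join("|", keywords.Select(Regex.Escape));
   // (up to x words) keyword (up to x words). The trailing context is optional,
   // so a keyword at the very end of the string still matches.
   string ex =
      @"(\w+\W+){0," + x + @"}\b(" + alternation + @")\b(\W+\w+){0," + x + @"}";
   Regex regex = new Regex(ex);
   // Note: matches do not overlap, so keywords within x words of each other
   // can end up inside a single match.
   List<string> matches = new List<string>();
   foreach (Match match in regex.Matches (s))
   {
      matches.Add(match.Value);
   }
   return matches;
}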
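
For comparison, the loop-based approach mentioned in the question does not have to be long. Below is a minimal sketch, not a definitive implementation: it splits the text into word tokens and takes a window of words around each keyword hit. The names `KeywordContext`, `Find`, and `window` are made up for illustration, and it assumes single-word, case-sensitive keywords and that punctuation can be dropped from the returned snippets.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordContext
{
    // Hypothetical helper (not part of the answer above): returns one snippet
    // per keyword occurrence, with up to `window` words on each side.
    public static List<string> Find(string text, IEnumerable<string> keywords, int window)
    {
        // Split into word tokens; whitespace and punctuation are discarded.
        string[] words = Regex.Matches(text, @"\w+")
                              .Cast<Match>()
                              .Select(m => m.Value)
                              .ToArray();

        // Case-sensitive lookup of single-word keywords (an assumption).
        var lookup = new HashSet<string>(keywords);
        var snippets = new List<string>();

        for (int i = 0; i < words.Length; i++)
        {
            if (!lookup.Contains(words[i])) continue;

            // Clamp the window to the bounds of the token array.
            int start = Math.Max(0, i - window);
            int end = Math.Min(words.Length - 1, i + window);
            snippets.Add(string.Join(" ", words, start, end - start + 1));
        }
        return snippets;
    }

    public static void Main()
    {
        var hits = Find("check if the value is set while we do it",
                        new[] { "if", "while", "do" }, 2);
        foreach (string hit in hits) Console.WriteLine(hit);
        // Prints:
        // check if the value
        // is set while we do
        // while we do it
    }
}
```

Unlike the single regex, each keyword occurrence gets its own snippet here, even when two keywords sit within a few words of each other. The trade-off is that the original spacing and punctuation are lost, because the snippet is rebuilt from the tokens.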