Lucene ignore keywords in search term

This seems like it should be simple, but I can't figure out how to get Lucene to ignore the AND, OR, and NOT keywords - the query parser throws a parse error when it gets one. I have a query builder class that splits the search term so that it searches on the words themselves as well as on n-grams in the word. I'm using Lucene in Java.

So in a search for, say, "ANDERSON COOPER" the query string looks like:

name: (ANDERSON COOPER "ANDERSON COOPER")^5 gram4: ( ANDE NDER DERS ERSO RSON SONC ONCO NCOO COOP OOPE OPER)

The query parser throws an error when it gets those ANDs. Ideally, I'd like the parser to just ignore AND, OR, and NOT altogether, and I'll use the &&, ||, and ! equivalents if I need them. Do I have to modify the code in the QueryParser class itself to get this, or is there an easier way? I could also just insert an escape character for these cases if that is the best way to do it, but adding \ before the word AND doesn't seem to do anything.


You can wrap the keyword in quotes, like this: "AND". Is that easy enough? A regex could probably do it if you know exactly what your queries look like.

The parser shouldn't have a problem with it, and a single-term PhraseQuery will be rewritten as a term query, so the extra cost is a small constant overhead, O(1).

The regex could probably look like this:

\b(AND|OR|NOT)\b

Which would be replaced with

"$1"
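In Java, that replacement could be applied to the raw search term before it goes into the query builder. This is only a minimal sketch: the class name KeywordQuoter and the method quoteOperators are made up for illustration, and it assumes the input is the user's raw term, not an already-built query string. If the term sits inside an existing quoted phrase, the added quotes would break that phrase, so run it on the individual words only.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: quotes stand-alone operator words
// so the query parser treats them as ordinary terms.
class KeywordQuoter {
    // \b word boundaries mean "AND" inside "ANDERSON" is left alone.
    private static final Pattern OPERATORS = Pattern.compile("\\b(AND|OR|NOT)\\b");

    static String quoteOperators(String rawTerm) {
        // $1 is the matched keyword; the surrounding quotes are literal.
        return OPERATORS.matcher(rawTerm).replaceAll("\"$1\"");
    }

    public static void main(String[] args) {
        // prints: ANDERSON "AND" COOPER
        System.out.println(quoteOperators("ANDERSON AND COOPER"));
    }
}
```

The pattern is case-sensitive on purpose, since only the upper-case forms are operators in the query syntax; a lower-case "and" passes through unchanged.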