How do I set up Lucene so that I can search ignoring whitespace characters?
For example, a list of part numbers includes:
JRB-1000
JRB 1000
JRB1000
JRB100-0
-JRB1000
If a user searches on 'JRB1000' or 'JRB 1000', I would like to return a match for all the part numbers above.
Write a custom Analyzer that either splits these into several tokens (JRB, 1000; relatively easy and forgiving to users) or concatenates them into a single token (JRB1000; harder but precise). Implementing your own Analyzer amounts to overriding the tokenStream method of an existing one and perhaps writing a custom TokenFilter class.
Apply your new Analyzer to both the documents being indexed and the queries.
(Links are for the Java version, but .NET should be similar.)
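To make the single-token option concrete, here is a minimal sketch of the normalization step in plain Java, with no Lucene dependency. The class name PartNumberNormalizer and the choice to drop only whitespace and hyphens are assumptions made for illustration. In a real setup this logic would live inside the custom Analyzer (for example in a TokenFilter), and the linear scan in search() only stands in for the index lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: not part of Lucene, just the normalization idea.
final class PartNumberNormalizer {

    // Assumption: whitespace and '-' are the only separators to ignore.
    static String normalize(String raw) {
        StringBuilder sb = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (!Character.isWhitespace(c) && c != '-') {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Stand-in for an index lookup: the same normalization is applied
    // to the stored part numbers and to the query before comparing.
    static List<String> search(List<String> partNumbers, String query) {
        String key = normalize(query);
        List<String> hits = new ArrayList<>();
        for (String part : partNumbers) {
            if (normalize(part).equals(key)) {
                hits.add(part);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList(
                "JRB-1000", "JRB 1000", "JRB1000", "JRB100-0", "-JRB1000", "XYZ-2000");
        // Both spellings of the query reduce to the key "JRB1000".
        System.out.println(search(parts, "JRB 1000"));
        System.out.println(search(parts, "JRB1000"));
    }
}
```

Because the same function runs at index time and at query time, every variant in the question collapses to JRB1000 and matches either query, while XYZ-2000 does not. Note that this only gives exact matches on the whole normalized part number. Partial matches such as searching for JRB alone would need the multi-token approach instead.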