
Lucene Analyzer to Use With Special Characters and Punctuation?

I have a Lucene index that has several documents in it. Each document has multiple fields such as:

Id
Project
Name
Description

The Id field is a unique identifier such as a GUID; Project is a user's ProjectID (a user can only view documents for their own project); and Name and Description contain text that can include special characters.

When a user performs a search on the Name field, I want the match to be as forgiving as possible. For example, searching for:

First

Will return both:

First.Last 

and

First.Middle.Last

Name can also be something like:

Test (NameTest)

Where, if a user types in 'Test', 'Name', or '(NameTest)', then they can find the result.

However, if I say that Project is 'ProjectA' then that needs to be an exact match (case insensitive search). The same goes with the Id field.

Which fields should I set up as Tokenized and which as Untokenized? Also, is there a good Analyzer I should consider to make this happen?

I am stuck trying to decide the best route to implement the desired searching.


Your Id field should be untokenized, for the simple reason that it does not appear it can be usefully tokenized (whitespace-based) unless you write your own tokenizer. You can tokenize all your other fields.
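One way to wire this up, sketched below, is Lucene's PerFieldAnalyzerWrapper: exact-match fields (Id, and Project if you want whole-value matching) get a KeywordAnalyzer, which emits the entire field value as a single token, while free-text fields fall through to a StandardAnalyzer. The field names are the ones from the question; the class and method names here are my own. Note that KeywordAnalyzer does not lower-case, so for a case-insensitive exact match on Project you would either lower-case the value yourself at index and query time, or build a small custom analyzer from KeywordTokenizer plus LowerCaseFilter.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class AnalyzerSetup {
    // Sketch: exact-match fields get KeywordAnalyzer (whole value = one token);
    // Name and Description fall through to the default StandardAnalyzer
    // (word tokens, lower-cased).
    public static Analyzer buildAnalyzer() {
        Map<String, Analyzer> perField = new HashMap<>();
        perField.put("Id", new KeywordAnalyzer());
        perField.put("Project", new KeywordAnalyzer());
        return new PerFieldAnalyzerWrapper(new StandardAnalyzer(), perField);
    }
}
```

Pass the same wrapper to both IndexWriterConfig and your query-side parser so the two sides agree on how each field is tokenized. One caveat on the "First.Last" case: StandardTokenizer follows Unicode word-break rules, which may keep letter.letter sequences together, so test it against your actual data; a simpler letter-only tokenizer may suit the Name field better.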

Perform a phrase query on the project name: look up PhraseQuery, or enclose your project name in double quotes, which makes it match as an exact phrase. Example: "\"My Fancy Project\""
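Built programmatically rather than through the query-string syntax, the same phrase query might look like the sketch below (recent Lucene versions have a PhraseQuery constructor taking the field and the terms; the class name is mine). The terms must be supplied in their analyzed form, here assumed lower-cased, to match how the field was indexed.

```java
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.Query;

public class ProjectPhrase {
    // Sketch: matches "my", "fancy", "project" appearing in that order,
    // adjacent to each other, in the Project field.
    public static Query projectQuery() {
        return new PhraseQuery("Project", "my", "fancy", "project");
    }
}
```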

For the name field a simple query should work fine.

If there are situations where you want a combination of fields, look up BooleanQuery, which lets you combine different queries with boolean logic.
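Since every search here must be restricted to the user's project anyway, a BooleanQuery combining the project filter with the name search is the natural shape. A minimal sketch, assuming both values are already in their analyzed (lower-cased) form and using Lucene's BooleanQuery.Builder API; the class and method names are mine:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class CombinedSearch {
    // Sketch: both clauses are MUST, so a document matches only if it
    // belongs to the given project AND its Name field contains the term.
    public static Query buildQuery(String project, String nameTerm) {
        return new BooleanQuery.Builder()
                .add(new TermQuery(new Term("Project", project)), BooleanClause.Occur.MUST)
                .add(new TermQuery(new Term("Name", nameTerm)), BooleanClause.Occur.MUST)
                .build();
    }
}
```

If the project clause should only restrict results without affecting ranking, Occur.FILTER does the same job as MUST but skips scoring.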

