
Solr Search for non-alphanumeric characters

I would like to know the best way to go about setting up a Solr schema to search for something like "#10" within the data.

Thanks.


There's actually quite a bit to your question, which I would break down as such:

  • What data fields do I need to search?
  • How am I going to search those fields?
  • What data do I need to retrieve from a search request?

Your schema design can't really be determined without answering these questions.

Those questions are a much longer topic, so I'm not going to go through them ad nauseam here (read the Solr docs for a greater understanding).

In dealing with special characters, what you care about is the analysis step in indexing, as you'll want your terms stored in a way that permits you to logically retrieve them. Analyzers can use a variety of tokenization strategies, and can apply modifications such as stemming to indexed content.

Analyzers are about breaking down term text; you'll want to ensure your special characters survive analysis and end up being indexed. I would start by looking at the WhitespaceAnalyzer, which splits only on whitespace and otherwise leaves terms from the source content in their exact state in the index. The Solr wiki page on Analyzers will give you an idea of how many of these function.
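
To make the point about characters surviving analysis concrete, here is a rough sketch in plain Python. It is not Solr or Lucene code; the two functions are hypothetical stand-ins that only imitate the general behaviour of a whitespace-only tokenizer versus one that keeps just letters and digits, and the sample text is made up.

```python
import re


def whitespace_tokenize(text):
    """Hypothetical stand-in: split on runs of whitespace only,
    keeping every other character (including '#') in the token."""
    return text.split()


def alnum_tokenize(text):
    """Hypothetical stand-in: keep only runs of ASCII letters and digits,
    so punctuation such as '#' never reaches the index."""
    return re.findall(r"[A-Za-z0-9]+", text)


doc = "Ship to Apt #10 today"  # made-up sample content

whitespace_terms = whitespace_tokenize(doc)
alnum_terms = alnum_tokenize(doc)

print(whitespace_terms)  # ['Ship', 'to', 'Apt', '#10', 'today']
print(alnum_terms)       # ['Ship', 'to', 'Apt', '10', 'today']

# A search for the exact term "#10" can only match if that term was indexed.
print("#10" in whitespace_terms)  # True
print("#10" in alnum_terms)       # False
```

With the second tokenizer, "#10" and "10" become indistinguishable, which is why the choice of analyzer on the field matters more here than anything done at query time.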
