How to generate non-prefix autocomplete suggestions?

2023-01-04 13:30

I would like to add autocomplete to my tagging functionality.

A couple of questions:

How do I generate a list of autocomplete suggestions that includes matches at both the start and the middle of a string? For example, if the user types "auto", the suggestions should include terms such as "autocomplete" and "build automation".
I would like to allow multi-word tags and use a comma (",") as the separator between tags. So when the user hits the space bar, they are still typing the same tag, but when they hit the comma key, they are starting a new tag. How do I do that?

I am using Django, jQuery, MySQL, and Solr. What is the best way to implement these two features?

I've implemented exactly what you're asking about and it works really well. There are a few issues to be aware of:

Highlighting in the results list summaries doesn't work, and the suggested workaround also doesn't work in this particular case.
If your documents have long titles and you truncate them when displayed, there's a chance you'll match on the prefix of a word that isn't being displayed. There are several ways to handle this, of course.
In a future version, I'd like to give words towards the start of the title a bit more weight than words at the end. This would be one way to mitigate the previous item.

Like the previous answer, I'd start with the same article linked above, but you DO want the Edge NGram filter. The thing you'll add is to ALSO do whitespace tokenization.

And then you'd make these changes to your schema.xml file. This example assumes you already have a field called "title" defined, and it's what you'd like to display as well. I create a second field, which is ONLY used for autocomplete prefix matching.

Step 1: Define Edge NGram Text field type

<types>
  <!-- ... other types ... -->

  <!-- Assuming you already have this -->
  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    ... normal text definition ...
  </fieldType>

  <!-- Adding this -->
  <fieldType name="prefix_edge_text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- not using enablePositionIncrements="true" for now -->
      <filter class="solr.StopFilterFactory" words="stopwords.txt" />
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" />
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- No need to create Edges here -->
      <!-- Don't want stopwords here -->
    </analyzer>
  </fieldType>

</types>
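
To see why this field type matches "auto" against "build automation", here is a rough Python sketch of what the two analyzers produce. It only approximates the Solr filters in plain Python, it is not Solr code. The function names are made up for illustration, and requiring every query word to match is an assumption (Solr's default operator depends on your configuration).

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Leading substrings of a token, like EdgeNGramFilterFactory with these sizes."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text, stopwords=frozenset()):
    """Approximate the index analyzer: whitespace split, lowercase, stopwords, edge n-grams."""
    terms = set()
    for token in text.split():
        token = token.lower()
        if token in stopwords:
            continue
        terms.update(edge_ngrams(token))
    return terms


def query_terms(text):
    """Approximate the query analyzer: whitespace split and lowercase only."""
    return [token.lower() for token in text.split()]


def matches(query, title, stopwords=frozenset()):
    """True if every query word is the prefix of some word in the title (assumed AND semantics)."""
    indexed = index_terms(title, stopwords)
    return all(term in indexed for term in query_terms(query))


if __name__ == "__main__":
    for title in ["autocomplete", "build automation", "tomato soup"]:
        print(title, matches("auto", title))
```

Because grams are built per word and only from the start of each word, "auto" matches the second word of "build automation", but "tomat" does not match "automation".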

Step 2: Define the New Field

<fields>
  <!-- ... other fields ... -->

  <!-- Assuming you already have this -->
  <field name="title" type="text" indexed="true" stored="true" multiValued="true"/>

  <!-- Adding this -->
  <field name="prefix_title" type="prefix_edge_text" indexed="true" stored="true" multiValued="true" />

</fields>

Step 3: Copy the Title's content over to the prefix field during indexing

<!-- Adding this -->
<copyField source="title" dest="prefix_title" />

That's pretty much it for the schema. Just remember:

When you do a regular search, you still search against the regular title field.
When you're doing an autocomplete search, search against the prefix_title.
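
As a rough sketch of the autocomplete side, this is how the request URL could be built from Python (for example in a Django view). The base URL, the helper names, and the choice to AND the words together are assumptions for illustration. Only the prefix_title and title field names come from the schema above.

```python
import re
from urllib.parse import urlencode

# Characters that have special meaning in the Solr/Lucene query syntax.
_SPECIAL = re.compile(r'(&&|\|\||[+\-!(){}\[\]^"~*?:\\/])')


def escape_term(term):
    """Backslash-escape query syntax characters in a single user-typed word."""
    return _SPECIAL.sub(r"\\\1", term)


def autocomplete_url(base_url, user_input, rows=10):
    """Build a select URL that searches prefix_title and returns the stored title."""
    # Lowercasing keeps typed words like "AND" or "OR" from acting as operators.
    terms = [escape_term(t) for t in user_input.lower().split()]
    query = " AND ".join(f"prefix_title:{t}" for t in terms)
    params = {"q": query, "fl": "title", "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # "http://localhost:8983/solr" is a placeholder, not a value from the original post.
    print(autocomplete_url("http://localhost:8983/solr", "Build Auto"))
```

A regular search would use the same pattern with the title field in place of prefix_title.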

Use the NGramTokenizerFactory. Use the analysis console to see how it works. Also see this article (but you would use NGram instead of EdgeNGram).
Not sure what you mean by "tags", but I guess you have a multivalued field "tags", so your code would parse the input (splitting on ",") before sending the data to Solr.
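
A minimal sketch of that parsing step in Python, assuming the raw text of the tag input arrives as one string. The function names are hypothetical. Collapsing whitespace and dropping case-insensitive duplicates are extra choices, not something the question requires.

```python
def parse_tags(raw):
    """Split comma-separated input into clean, de-duplicated multi-word tags."""
    seen = set()
    tags = []
    for piece in raw.split(","):
        tag = " ".join(piece.split())  # trim and collapse inner whitespace
        key = tag.lower()
        if tag and key not in seen:    # skip empty pieces and repeated tags
            seen.add(key)
            tags.append(tag)
    return tags


def current_fragment(raw):
    """Return the text after the last comma: the tag the user is still typing."""
    return raw.rsplit(",", 1)[-1].strip()


if __name__ == "__main__":
    print(parse_tags("build automation, autocomplete,, Autocomplete ,solr"))
    print(current_fragment("solr, build auto"))
```

Here parse_tags would run when the form is saved, while current_fragment gives the piece to send as the autocomplete query, so spaces stay inside a tag and only a comma starts a new one.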
