
Apache Solr More Like This: How does the mlt.minwl parameter affect query results?

The Apache Solr wiki states that mlt.minwl = minimum word length below which words will be ignored.

A concrete example would be appreciated.

Example query (decoded)

(string:822) qt=mlt&fl=nid%2Ctitle%2Cpath%2Curl%2Css_simple_geo_position%2Cis_cck_field_sponsored_content_yn%2Cis_cck_field_compound_review_yn%2Cis_workflow_state%2Cds_cck_field_publish_date%2Cds_cck_field_publish_expiration_date%2Csm_timeout_search_event_type%2Csm_timeout_search_event_genre%2Csm_timeout_search_venue_type%2Csm_timeout_search_venue_feature%2Csm_timeout_search_venue_genre&mlt.fl=body%2Cname%2Ctaxonomy_names%2Ctitle&mlt.mintf=1&mlt.mindf=1&mlt.minwl=3&mlt.maxwl=15&mlt.maxqt=20&fq%5B0%5D=-is_cck_field_exclude_from_search%3A1&fq%5B1%5D=-type%3Aimage&fq%5B2%5D=-type%3Aoccurrence&fq%5B3%5D=%28nodeaccess_f98a254002c9_all%3A0+OR+nodeaccess_f98a254002c9_workflow_access%3A1+OR+nodeaccess_f98a254002c9_workflow_access_owner%3A0+OR+nodeaccess_all%3A0%29&facet.limit=21&version=1.2&wt=json&json.nl=map&q=milk&start=0&rows=4
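
The query string above is still percent-encoded, so the MLT settings are easier to read once decoded. A minimal sketch using Python's standard `urllib.parse`, applied to a shortened excerpt of the query (the excerpt and the variable names are chosen here for illustration only):

```python
from urllib.parse import parse_qs

# Shortened excerpt of the query string above: only the MLT-related
# parameters and the q parameter are kept for readability.
query = (
    "qt=mlt&mlt.fl=body%2Cname%2Ctaxonomy_names%2Ctitle"
    "&mlt.mintf=1&mlt.mindf=1&mlt.minwl=3&mlt.maxwl=15&mlt.maxqt=20&q=milk"
)

# parse_qs percent-decodes values and returns a list per key;
# each key occurs once here, so take the first value.
params = {key: values[0] for key, values in parse_qs(query).items()}

# Keep only the MoreLikeThis parameters.
mlt_params = {k: v for k, v in params.items() if k.startswith("mlt.")}

for name, value in mlt_params.items():
    print(name, "=", value)
```

Decoded this way, the request asks for terms from the fields body, name, taxonomy_names and title, with word lengths limited to between 3 and 15 characters.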


If an "interesting term" found by MLT has fewer than mlt.minwl characters, it is ignored: the term is dropped from the set of terms used to build the MLT query.
The default value of this parameter is 0, which means that it has no effect.
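
As a concrete illustration, the length check can be sketched in a few lines of Python. This is a simplified model, not Solr's actual code: the function name `filter_by_word_length` and the sample terms are made up for the example, and it assumes that a value of 0 disables the corresponding bound for both mlt.minwl and mlt.maxwl.

```python
def filter_by_word_length(terms, min_wl=0, max_wl=0):
    """Keep terms whose length lies within the configured bounds.

    Assumption for this sketch: a bound of 0 means "no limit".
    """
    kept = []
    for term in terms:
        if min_wl > 0 and len(term) < min_wl:
            continue  # shorter than mlt.minwl: ignored
        if max_wl > 0 and len(term) > max_wl:
            continue  # longer than mlt.maxwl: ignored
        kept.append(term)
    return kept


# Hypothetical candidate terms taken from a document about milk.
terms = ["a", "of", "cow", "milk", "dairy"]

# With mlt.minwl=3 and mlt.maxwl=15, as in the example query.
print(filter_by_word_length(terms, min_wl=3, max_wl=15))

# With the default of 0, nothing is filtered out.
print(filter_by_word_length(terms))
```

With mlt.minwl=3, the one- and two-letter terms "a" and "of" never reach the generated query, while "cow", "milk" and "dairy" remain candidates.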

The essence of the internal workings of MLT is as follows:

  1. Gather all of the terms with frequency information from the input document. If the input document is a reference to a document within the index, loop over the fields listed in mlt.fl; the term information needed is readily available if the field has termVectors enabled. Otherwise, get the stored text and re-analyze it to derive the terms. If the input document is posted as text to the handler, analyze it to derive the terms. The analysis used is the one configured for the first field listed in mlt.fl.
  2. Filter the "interesting terms" based on the configured thresholds, one of which is your mlt.minwl param.
  3. Construct a query with these interesting terms across all of the fields listed in mlt.fl.
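
The steps listed above can be sketched roughly as follows. This is a deliberately simplified model under several assumptions: whitespace tokenization stands in for the field's analyzer, terms are ranked by raw term frequency rather than by Solr's actual scoring, mlt.mindf is left out because it needs index statistics, and the function names `interesting_terms` and `build_query` are hypothetical.

```python
from collections import Counter


def interesting_terms(text, min_tf=1, min_wl=0, max_wl=0, max_qt=20):
    """Gather terms with frequencies, then filter them by the thresholds."""
    # Gather: naive whitespace tokenization (assumption; Solr uses the
    # analysis chain configured for the field).
    tf = Counter(text.lower().split())

    # Filter: apply the term frequency and word length thresholds.
    # A length bound of 0 is treated as "no limit".
    kept = [
        (term, freq)
        for term, freq in tf.items()
        if freq >= min_tf
        and (min_wl == 0 or len(term) >= min_wl)
        and (max_wl == 0 or len(term) <= max_wl)
    ]

    # Rank by frequency (ties broken alphabetically) and cap at max_qt terms.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in kept[:max_qt]]


def build_query(terms, fields):
    """Construct one clause per (field, term) pair, joined with OR."""
    return " OR ".join(f"{field}:{term}" for field in fields for term in terms)


doc = "milk is a dairy drink so milk is rich in calcium"
terms = interesting_terms(doc, min_tf=1, min_wl=3, max_wl=15, max_qt=3)
print(terms)
print(build_query(terms, ["title", "body"]))
```

In this sketch the short words "is", "a", "so" and "in" are removed by the mlt.minwl=3 threshold before the query is built, so they cannot influence which documents are considered similar.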