Solr boolean queries combined with index-time boosts
I have a site using Solr 1.4.1 for relevancy/recommendations, and I use boolean-style queries in some places, for example +(+type:aoh_company +aoh_dictionary_tids:623). That returns the expected results, but the order of the results appears to be arbitrary.
I am trying to control the ranking of the documents by setting index-time boosts, but they seem to be ignored for these queries.
An example
- The query URL is
http://localhost:4930/solr/prod/select?rows=5&start=0&q.alt=(type%3Aaoh_company)+(aoh_dictionary_tids%3A623)&q=
- The results are returned in this order (with the index time boost value in parentheses):
- 17132 (1.22)
- 17179 (1.02)
- 17131 (1.10)
- 17133 (1.10)
- 17184 (1.10)
- Obviously, result #2 should not come before #3-5 based on the boost alone.
- Given this is a boolean query, there should not be much difference in ranking.
Debugging output
I tried debugging the query above by appending debugQuery=true, so it becomes http://localhost:4930/solr/prod/select?rows=5&start=0&q.alt=(type%3Aaoh_company)+(aoh_dictionary_tids%3A623)&q=&debugQuery=true
It's very verbose, but here it is:
<lst name="debug">
<null name="rawquerystring"/>
<null name="querystring"/>
<str name="parsedquery">+(+type:aoh_company +aoh_dictionary_tids:623)</str>
<str name="parsedquery_toString">+(+type:aoh_company +aoh_dictionary_tids:623)</str>
<lst name="explain">
<str name="50hves/node/17132">
1.7819747 = (MATCH) sum of:
0.9014403 = (MATCH) weight(type:aoh_company in 1805), product of:
0.37135038 = queryWeight(type:aoh_company), product of:
2.4274657 = idf(docFreq=457, maxDocs=1909)
0.15297863 = queryNorm
2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1805), product of:
1.0 = tf(termFreq(type:aoh_company)=1)
2.4274657 = idf(docFreq=457, maxDocs=1909)
1.0 = fieldNorm(field=type, doc=1805)
0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1805), product of:
0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
6.069428 = idf(docFreq=11, maxDocs=1909)
0.15297863 = queryNorm
0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1805), product of:
1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
6.069428 = idf(docFreq=11, maxDocs=1909)
0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1805)
</str>
<str name="50hves/node/17179">
1.7819747 = (MATCH) sum of:
0.9014403 = (MATCH) weight(type:aoh_company in 1896), product of:
0.37135038 = queryWeight(type:aoh_company), product of:
2.4274657 = idf(docFreq=457, maxDocs=1909)
0.15297863 = queryNorm
2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1896), product of:
1.0 = tf(termFreq(type:aoh_company)=1)
2.4274657 = idf(docFreq=457, maxDocs=1909)
1.0 = fieldNorm(field=type, doc=1896)
0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1896), product of:
0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
6.069428 = idf(docFreq=11, maxDocs=1909)
0.15297863 = queryNorm
0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1896), product of:
1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
6.069428 = idf(docFreq=11, maxDocs=1909)
0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1896)
</str>
<str name="50hves/node/17131">
1.7819747 = (MATCH) sum of:
0.9014403 = (MATCH) weight(type:aoh_company in 1905), product of:
0.37135038 = queryWeight(type:aoh_company), product of:
2.4274657 = idf(docFreq=457, maxDocs=1909)
0.15297863 = queryNorm
2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1905), product of:
1.0 = tf(termFreq(type:aoh_company)=1)
2.4274657 = idf(docFreq=457, maxDocs=1909)
1.0 = fieldNorm(field=type, doc=1905)
0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1905), product of:
0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
6.069428 = idf(docFreq=11, maxDocs=1909)
0.15297863 = queryNorm
0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1905), product of:
1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
6.069428 = idf(docFreq=11, maxDocs=1909)
0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1905)
</str>
<str name="50hves/node/17133">
1.7819747 = (MATCH) sum of:
0.9014403 = (MATCH) weight(type:aoh_company in 1906), product of:
0.37135038 = queryWeight(type:aoh_company), product of:
2.4274657 = idf(docFreq=457, maxDocs=1909)
0.15297863 = queryNorm
2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1906), product of:
1.0 = tf(termFreq(type:aoh_company)=1)
2.4274657 = idf(docFreq=457, maxDocs=1909)
1.0 = fieldNorm(field=type, doc=1906)
0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1906), product of:
0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
6.069428 = idf(docFreq=11, maxDocs=1909)
0.15297863 = queryNorm
0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1906), product of:
1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
6.069428 = idf(docFreq=11, maxDocs=1909)
0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1906)
</str>
<str name="50hves/node/17184">
1.6058679 = (MATCH) sum of:
0.9014403 = (MATCH) weight(type:aoh_company in 1892), product of:
0.37135038 = queryWeight(type:aoh_company), product of:
2.4274657 = idf(docFreq=457, maxDocs=1909)
0.15297863 = queryNorm
2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1892), product of:
1.0 = tf(termFreq(type:aoh_company)=1)
2.4274657 = idf(docFreq=457, maxDocs=1909)
1.0 = fieldNorm(field=type, doc=1892)
0.7044275 = (MATCH) weight(aoh_dictionary_tids:623 in 1892), product of:
0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
6.069428 = idf(docFreq=11, maxDocs=1909)
0.15297863 = queryNorm
0.7586785 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1892), product of:
1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
6.069428 = idf(docFreq=11, maxDocs=1909)
0.125 = fieldNorm(field=aoh_dictionary_tids, doc=1892)
</str>
</lst>
<str name="QParser">DisMaxQParser</str>
<str name="altquerystring">org.apache.lucene.search.BooleanQuery:+type:aoh_company +aoh_dictionary_tids:623</str>
<null name="boostfuncs"/>
<lst name="timing">
<double name="time">7.0</double>
<lst name="prepare">
<double name="time">1.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
<lst name="process">
<double name="time">6.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">6.0</double>
</lst>
</lst>
</lst>
As I read it, the first four results are scored 1.7819747 and the fifth is scored 1.6058679. I can't see the boost values anywhere in there, so it seems that they are not a factor in the ranking equation.
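For anyone who wants to check the numbers, here is a small Python sketch that recomputes the scores from the values printed in the explain output above. The function names are made up for illustration; the only thing taken from Solr is the structure shown in the explain tree, where each clause contributes queryWeight * fieldWeight, with queryWeight = idf * queryNorm and fieldWeight = tf * idf * fieldNorm. The only input that differs between the documents is the fieldNorm of aoh_dictionary_tids.

```python
# Values copied from the explain output above.
QUERY_NORM = 0.15297863
IDF_TYPE = 2.4274657   # idf(docFreq=457, maxDocs=1909)
IDF_TIDS = 6.069428    # idf(docFreq=11, maxDocs=1909)


def term_weight(idf, query_norm, tf, field_norm):
    """Contribution of one term clause, as laid out in the explain tree."""
    query_weight = idf * query_norm        # queryWeight
    field_weight = tf * idf * field_norm   # fieldWeight
    return query_weight * field_weight


def doc_score(tids_field_norm):
    """Sum of both required clauses; 'type' always shows fieldNorm 1.0."""
    type_part = term_weight(IDF_TYPE, QUERY_NORM, 1.0, 1.0)
    tids_part = term_weight(IDF_TIDS, QUERY_NORM, 1.0, tids_field_norm)
    return type_part + tids_part


if __name__ == "__main__":
    # First four documents have fieldNorm 0.15625, the fifth has 0.125.
    print(round(doc_score(0.15625), 5))
    print(round(doc_score(0.125), 5))
```

So the whole score difference between the first four documents and the fifth comes from that single fieldNorm value.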
So what am I doing wrong? Is there something I need to do to make Solr take the boosts into consideration?
Is there a way to check the boost value stored in Solr? It looks right in the documents I send to it, but I can't find a way to see the stored value.
Additionally, here are the relevant parts of my schema.xml:
<types>
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="integer" class="solr.IntField" omitNorms="true"/>
</types>
<fields>
<field name="type" type="string" indexed="true" stored="true"/>
<field name="aoh_dictionary_tids" type="integer" indexed="true" stored="true" multiValued="true" omitNorms="false"/>
</fields>
In his answer below, fyr mentioned that norms need to be enabled on the field for the boost value to apply. So I'd like to amend my question a bit:
- Is it enough to have norms enabled on one of the queried fields for the boost to apply?
- Does my omitNorms="false" on the field override the omitNorms="true" on the fieldType?
Any help would be greatly appreciated.
You will not see the boost as a separate entry in the explain output. An index-time boost is applied to the norm of a given field in a given document, acting as a multiplier.
If you have norms enabled, your boost value is used at indexing time. Norms are always part of the similarity function if you use DefaultSimilarity and norms are enabled.
Edit for the follow-up questions:
It is enough to have norms enabled for the boost to apply, because norms give the field a per-document weight in the index. Index-time boosts are multiplied into the norm value and stored in the norm.
omitNorms on the field declaration overrides the type definition. You can also see this in your explain output: aoh_dictionary_tids has a fieldNorm that does not equal 1 (0.15625 or 0.125), while type, which has norms omitted, shows 1.0. If norms are disabled, 1 is applied as the default.
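One thing that may explain why small boost differences do not show up: as far as I know, Lucene of that generation stores each norm in a single byte (SmallFloat.floatToByte315 in the Java source), so the stored value is heavily quantized. The following is a Python re-implementation sketch of that encoding, written from memory of the Java code rather than copied from it, so treat it as an assumption. The sample inputs are hypothetical raw norms, not values from your index.

```python
import struct


def float_to_byte315(f):
    """Encode a float into one byte (assumed port of SmallFloat.floatToByte315)."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]  # raw float32 bits
    small = bits >> 21                  # keep exponent plus top mantissa bits
    fzero = (63 - 15) << 3
    if small <= fzero:
        return 0 if bits <= 0 else 1    # underflow: zero or smallest positive
    if small >= fzero + 0x100:
        return 255                      # overflow: largest representable
    return small - fzero


def byte315_to_float(b):
    """Decode the byte back into a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def quantize(raw_norm):
    """Norm value as it would be read back from the index."""
    return byte315_to_float(float_to_byte315(raw_norm))


if __name__ == "__main__":
    # Hypothetical raw norms (boost multiplied by length normalization).
    for raw in (0.13, 0.16, 0.18, 0.19):
        print(raw, "->", quantize(raw))
```

Under this encoding, every raw norm from 0.15625 up to just below 0.1875 is stored as 0.15625, so boosts that differ by only a few percent can end up with identical fieldNorm values. Whether that is what happened to your documents depends on the number of terms in each field, which the explain output does not show.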