How does Sphinx calculate the weight?
Note: This is a cross-post. I first posted it on the Sphinx forum, but I got no answer, so I am posting it here.
First, take a look at an example.
The following is my table (used only for testing):
+----+--------------------------+----------------------+
| Id | title                    | body                 |
+----+--------------------------+----------------------+
|  1 | National first hospital  | NASA                 |
|  2 | National second hospital | Space Administration |
|  3 | National govenment       | Support the hospital |
+----+--------------------------+----------------------+
I want to search the contents of the title and body fields, so I configured sphinx.conf as follows:
--------The sphinx config file----------
source mysql
{
    type = mysql
    sql_host = localhost
    sql_user = root
    sql_pass = 0000
    sql_db = testfull
    sql_port = 3306 # optional, default is 3306
    sql_query_pre = SET NAMES utf8
    sql_query = SELECT * FROM test
}

index mysql
{
    source = mysql
    path = var/data/mysql_old_test
    docinfo = extern
    mlock = 0
    morphology = stem_en, stem_ru, soundex
    min_stemming_len = 1
    min_word_len = 1
    charset_type = utf-8
    html_strip = 0
}

indexer
{
    mem_limit = 128M
}

searchd
{
    listen = 9312
    read_timeout = 5
    max_children = 30
    max_matches = 1000
    seamless_rotate = 0
    preopen_indexes = 0
    unlink_old = 1
    pid_file = var/log/searchd_mysql.pid
    log = var/log/searchd_mysql.log
    query_log = var/log/query_mysql.log
}
------------------
Then I reindexed the database and started the searchd daemon.
On the client side I set the options as follows:
----------Client side config-------------------
SphinxClient sc = new SphinxClient();
// ... other settings ...
HashMap<String, Integer> weiMap = new HashMap<String, Integer>();
weiMap.put("title", 100); // matches in the title should count the most
weiMap.put("body", 0);    // matches in the body should add nothing
sc.SetFieldWeights(weiMap);
sc.SetMatchMode(SphinxClient.SPH_MATCH_ALL);
sc.SetSortMode(SphinxClient.SPH_SORT_EXTENDED, "@weight DESC");
When I search for "National hospital", I get the following output:
Query 'National hospital' retrieved 3 of 3 matches in 0.0 sec.
Query stats:
    'nation' found 3 times in 3 documents
    'hospit' found 3 times in 3 documents
Matches:
1. id=3, weight=101
2. id=1, weight=100
3. id=2, weight=100
The number of matches (three) is right, but the order of the results is not what I wanted.
The documents with id 1 and 2 are obviously the closest to the search string ("National hospital"), so in my opinion they should get the largest weights, yet they are ordered last.
I wonder if there is any way to meet my requirement?
PS:
1) Please do not suggest that I set the sort mode to:
sc.SetSortMode(SphinxClient.SPH_SORT_EXTENDED,"@weight ASC");
This may work for this one example, but it will cause other potential problems.
2) Actually, the contents of my table are in Chinese; I just used "National Hosp..l" to make an example.
1° You search for "National hospital", but Sphinx searches for "nation" and "hospit" because of the stemmers in
morphology = stem_en, stem_ru, soundex
2° You assign the weights
weiMap.put("title", 100);
weiMap.put("body", 0);
to nonexistent text fields:
sql_query = SELECT * FROM test
3° Finally, my simple answer to the main question:
You sort by weight, and the third row has more weight because there are no words between "nation" and "hospit".
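To make the numbers 100 and 101 easier to reason about, here is a minimal Java sketch of a simplified proximity-style weight. It is not Sphinx's actual code. It assumes that, in SPH_MATCH_ALL mode, the weight is the sum over fields of (field weight × length of the longest run of query words that appears contiguously in that field), and it further assumes that a field weight of 0 behaves like 1. Stemming is ignored; words are only lower-cased. The class name ProximityWeightSketch and its methods are hypothetical names made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;

public class ProximityWeightSketch {

    // Length of the longest run of query words that occurs contiguously,
    // in query order, inside the field (0 if no query word occurs at all).
    static int longestCommonRun(List<String> query, List<String> field) {
        int best = 0;
        int[][] run = new int[query.size() + 1][field.size() + 1];
        for (int i = 1; i <= query.size(); i++) {
            for (int j = 1; j <= field.size(); j++) {
                if (query.get(i - 1).equals(field.get(j - 1))) {
                    run[i][j] = run[i - 1][j - 1] + 1;
                    best = Math.max(best, run[i][j]);
                }
            }
        }
        return best;
    }

    // Crude tokenizer: lower-case and split on whitespace (no stemming).
    static List<String> words(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    // ASSUMPTION: a configured field weight of 0 is treated as 1.
    static int weight(String query, String title, int titleWeight,
                      String body, int bodyWeight) {
        List<String> q = words(query);
        return Math.max(1, titleWeight) * longestCommonRun(q, words(title))
             + Math.max(1, bodyWeight) * longestCommonRun(q, words(body));
    }

    public static void main(String[] args) {
        String q = "National hospital";
        System.out.println(weight(q, "National first hospital", 100, "NASA", 0));
        System.out.println(weight(q, "National second hospital", 100, "Space Administration", 0));
        System.out.println(weight(q, "National govenment", 100, "Support the hospital", 0));
    }
}
```

Under these assumptions the three rows score 100, 100 and 101, which matches the output in the question: rows 1 and 2 have the two words separated in the title, so the title contributes only one run of length 1, while row 3 picks up one extra point from the body. In the same model, a title containing the exact phrase "National hospital" would score 200.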