
How does Sphinx calculate the weight?

Note:

This is a cross-post. I first posted it on the Sphinx forum, but I got no answer, so I am posting it here.

First, take a look at an example.

The following is my table (used only for testing):

+----+--------------------------+----------------------+
| Id | title                    | body                 |
+----+--------------------------+----------------------+
|  1 | National first hospital  | NASA                 |
|  2 | National second hospital | Space Administration |
|  3 | National govenment       | Support the hospital |
+----+--------------------------+----------------------+

I want to search the contents of the title and body fields, so I configured sphinx.conf as follows:

--------The sphinx config file----------
source mysql
{
        type = mysql
        sql_host = localhost
        sql_user = root
        sql_pass = 0000
        sql_db = testfull
        sql_port = 3306 # optional, default is 3306
        sql_query_pre = SET NAMES utf8
        sql_query = SELECT * FROM test
}

index mysql
{
        source = mysql
        path = var/data/mysql_old_test
        docinfo = extern
        mlock = 0
        morphology = stem_en, stem_ru, soundex
        min_stemming_len = 1
        min_word_len = 1
        charset_type = utf-8
        html_strip = 0
}

indexer
{
        mem_limit = 128M
}

searchd
{
        listen = 9312
        read_timeout = 5
        max_children = 30
        max_matches = 1000
        seamless_rotate = 0
        preopen_indexes = 0
        unlink_old = 1
        pid_file = var/log/searchd_mysql.pid
        log = var/log/searchd_mysql.log
        query_log = var/log/query_mysql.log
}
------------------

Then I reindexed the database and started the searchd daemon.

On the client side I set the following options:

----------Client side config-------------------

sc = new SphinxClient();
// ... other setup omitted ...

// Per-field weights: title should dominate, body is meant to be ignored
HashMap<String, Integer> weiMap = new HashMap<String, Integer>();
weiMap.put("title", 100);
weiMap.put("body", 0);
sc.SetFieldWeights(weiMap);

// Require every query word to match, then sort by relevance weight
sc.SetMatchMode(SphinxClient.SPH_MATCH_ALL);
sc.SetSortMode(SphinxClient.SPH_SORT_EXTENDED, "@weight DESC");
------------------

When I search for "National hospital", I get the following output:

Query 'National hospital' retrieved 3 of 3 matches in 0.0 sec.
Query stats:
        'nation' found 3 times in 3 documents
        'hospit' found 3 times in 3 documents

Matches:
1. id=3, weight=101
2. id=1, weight=100
3. id=2, weight=100

The number of matches (three) is right; however, the order of the results is not what I wanted.

Obviously the documents with ids 1 and 2 are the closest to the query string ("National hospital"), so in my opinion they should get the largest weights, but they are ordered last.

Is there any way to meet my requirement?

PS:

1) Please do not suggest that I set the sort mode to:

sc.SetSortMode(SphinxClient.SPH_SORT_EXTENDED,"@weight ASC");

This may work for this one example, but it would cause other potential problems.

2) Actually the contents of my table are in Chinese; I only use "National Hosp..l" as an example.


1° You ask for "National hospital", but Sphinx searches for "nation" and "hospit" because of the morphology setting:

 morphology = stem_en, stem_ru, soundex

2° You assign weights

 weiMap.put("title", 100);
 weiMap.put("body", 0);

to nonexistent text fields, since the source query never names them:

 sql_query = SELECT * FROM test

3° Finally, my simple answer to the main question:

You sort by weight, and the third row gets more weight because there are no words between "nation" and "hospit".
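
The exact numbers in the output (100, 100, 101) can be reproduced with a rough model of the scoring. The sketch below is not Sphinx's code. It assumes that in SPH_MATCH_ALL mode the weight is the sum, over all fields, of the field weight multiplied by the longest run of query words that appear adjacent and in order in that field, and that a field weight of 0 is effectively treated as 1. The second assumption is only a guess that happens to fit the output above. The class and method names (WeightSketch, longestRun, weight, doc) are hypothetical, and stemming is replaced by simple lowercasing.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the weights seen in the question; not Sphinx's actual code.
class WeightSketch {

    // Lowercase and split on whitespace (stands in for real tokenizing and stemming).
    static List<String> tokens(String s) {
        return Arrays.asList(s.toLowerCase().trim().split("\\s+"));
    }

    // Longest run of query words found adjacent and in query order inside one field.
    static int longestRun(List<String> query, List<String> field) {
        int best = 0;
        for (int q = 0; q < query.size(); q++) {
            for (int f = 0; f < field.size(); f++) {
                int len = 0;
                while (q + len < query.size() && f + len < field.size()
                        && query.get(q + len).equals(field.get(f + len))) {
                    len++;
                }
                best = Math.max(best, len);
            }
        }
        return best;
    }

    // Sum over fields of (field weight * longest run).
    static int weight(String query, Map<String, String> doc, Map<String, Integer> fieldWeights) {
        List<String> q = tokens(query);
        int total = 0;
        for (Map.Entry<String, String> e : doc.entrySet()) {
            // ASSUMPTION: a weight of 0 behaves like 1, which fits the observed 101.
            int w = Math.max(1, fieldWeights.getOrDefault(e.getKey(), 1));
            total += w * longestRun(q, tokens(e.getValue()));
        }
        return total;
    }

    // Build a two-field document like a row of the test table.
    static Map<String, String> doc(String title, String body) {
        Map<String, String> d = new LinkedHashMap<String, String>();
        d.put("title", title);
        d.put("body", body);
        return d;
    }

    public static void main(String[] args) {
        Map<String, Integer> w = new HashMap<String, Integer>();
        w.put("title", 100);
        w.put("body", 0);
        String query = "National hospital";
        System.out.println("id=1 " + weight(query, doc("National first hospital", "NASA"), w));
        System.out.println("id=2 " + weight(query, doc("National second hospital", "Space Administration"), w));
        System.out.println("id=3 " + weight(query, doc("National govenment", "Support the hospital"), w));
    }
}
```

Under these assumptions rows 1 and 2 score 100 (one query word in a row in the title, nothing in the body), while row 3 scores 101: 100 for "National" in the title plus 1 for "hospital" in the body. If the model holds, the extra point comes from the body field still contributing, and none of the three rows gets a phrase bonus because no row contains the two query words side by side.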
