Possible to rank partial matches in Postgres full text search?

Asked 2022-12-20 23:17

I'm trying to calculate a ts_rank for a full-text match where some of the terms in the query may not be in the tsvector against which it is being matched. I would like the rank to be higher in a match where more words match. Seems pretty simple?

Because not all of the terms have to match, I have to | the operands, to give a query such as to_tsquery('one|two|three') (if it was &, all would have to match).

The problem is, the rank value seems to be the same no matter how many words match. In other words, it's maxing rather than multiplying the clauses.

select ts_rank('one two three'::tsvector, to_tsquery('one')); gives 0.0607927.

select ts_rank('one two three'::tsvector, to_tsquery('one|two|three|four')); gives the expected lower value of 0.0455945 because 'four' is not in the vector.

But select ts_rank('one two three'::tsvector, to_tsquery('one|two'));

gives 0.0607927 and likewise

select ts_rank('one two three'::tsvector, to_tsquery('one|two|three'));

gives 0.0607927
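For anyone who wants to reproduce this, the four calls above can be run side by side in one statement. This is only a convenience sketch: it uses the same literal vector and the default text search configuration as the examples, and the column aliases are my own.

```sql
-- Rank the same vector against OR-queries that differ only in how many terms they list.
-- The literal queries are the ones from the examples above.
SELECT t.q AS query,
       ts_rank('one two three'::tsvector, to_tsquery(t.q)) AS rank
FROM (VALUES ('one'),
             ('one|two'),
             ('one|two|three'),
             ('one|two|three|four')) AS t(q);
```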

I would like the result of ts_rank to be higher if more terms match.

Possible?

To counter one possible response: I cannot calculate all possible subsequences of the search query as intersections and then union them all in a query because I am going to be working with large queries. I'm sure there are plenty of arguments against this anyway!

Edit: I'm aware of ts_rank_cd but it does not solve the above problem.

Use the smlar extension (Linux only AFAIK, written by the same guys that brought us text search).

It has functions for calculating TF-IDF, cosine, or overlap similarity between arrays. It supports indexing, so it is fast.
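A rough sketch of how this might look, assuming smlar is installed as an extension and exposes its smlar(anyarray, anyarray) similarity function. The table docs(id, body) is hypothetical, the 'simple' configuration is my choice to keep lexemes unstemmed, and tsvector_to_array needs PostgreSQL 9.6 or later. Check the smlar README for the exact function and setting names in your version.

```sql
-- Assumption: the smlar extension is available on this server.
CREATE EXTENSION IF NOT EXISTS smlar;

-- Hypothetical table docs(id, body).
-- Compare each document's lexemes with the query terms as two text arrays;
-- the score is meant to grow with the overlap between them.
SELECT d.id,
       smlar(tsvector_to_array(to_tsvector('simple', d.body)),
             ARRAY['one', 'two', 'three', 'four']) AS similarity
FROM docs AS d
ORDER BY similarity DESC;
```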

Another way would be to "spell-check" the query prior to using it, basically removing any query terms that are not in your corpus.
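One way this pruning could be sketched in plain PostgreSQL is with ts_stat, which lists the lexemes present in a set of tsvectors. The table docs(body) and the term list are hypothetical, the 'simple' configuration is my assumption so that terms are compared unstemmed, and the terms are assumed to be plain words with no tsquery operators in them.

```sql
-- Hypothetical table docs(body). ts_stat scans every document, so on a large
-- corpus its result would normally be stored in a table rather than recomputed.
WITH corpus AS (
    SELECT word
    FROM ts_stat('SELECT to_tsvector(''simple'', body) FROM docs')
)
-- Keep only the query terms that occur somewhere in the corpus, then OR them.
-- If no term survives, string_agg yields NULL and so does the query.
SELECT to_tsquery('simple', string_agg(t.term, ' | ')) AS pruned_query
FROM unnest(ARRAY['one', 'two', 'three', 'four']) AS t(term)
WHERE t.term IN (SELECT word FROM corpus);
```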

The conclusion that I have come to is to & the items together for the ranking. In my select query (with which I'm doing the search) the items are |ed. This seems to work.
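In query form, that approach might look roughly like the following. The table docs(id, body) is hypothetical and the terms are the ones from the question; I have only observed that this "seems to work", so check the resulting order against your own data.

```sql
-- Hypothetical table docs(id, body).
-- Filter with the OR-query so partial matches are kept,
-- but compute the rank from the AND-query over the same terms.
SELECT d.id,
       ts_rank(to_tsvector(d.body), to_tsquery('one & two & three')) AS rank
FROM docs AS d
WHERE to_tsvector(d.body) @@ to_tsquery('one | two | three')
ORDER BY rank DESC;
```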
