开发者

How to find similar stories based on keyword matching?

I have a table, story_keywords which contain keywords matched up to a specific story with the following columns: id, story_id, keyword_id.

Now what I want to do is grab any story that shares at least two keywords with the primary story I'm outputting from the initial loop of getting all stories for today.

// get list of keyword_ids for the primary story

// get list of all stories, and their keyword_ids, that match when I cycle through primary story keyword_ids

But I'm stuck, I can't think this out logically for some reason. Can someone advise? I'm not sure what other information I can provide to make this more clear.

I can get a li开发者_高级运维st of stories that match with 1 keyword from the primary story, but I want it to match at least two keywords in order to output the similar story.


You could use a keyword_id IN(...) clause to find stories that share any of the same keywords as your primary story. To find related stories with the highest relevancy you'd just sort the results by the number of keyword hits in descending order.

Something roughly like:

SELECT story_id, COUNT(story_id) AS relevancy 
FROM story_keywords 
WHERE keyword_id IN (...) 
GROUP BY story_id 
HAVING relevancy > 2 
ORDER BY relevancy DESC
LIMIT 5

You probably also want to add another WHERE clause to exclude the original story_id.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜