Data structure, best practice to build true "related by tags" items list?
What I mean saying true "related by tags" list?
Let's imagine that article have 3 tags: A, B, C. True "related by tags" articles for this item will be articles fistly having A, B, C tags, then (A, B), (A, C), (B, C) etc.
table: tags
tag_id
tag_title
tag_nicetitle
table: tags2articles
article_id
tag_id
Using this tables structure is too difficult to calculate true "related by tags".
We can add one more table containing article_id and it's md5(A,B,C). Before hashing we should sort tags by alphabet.
table: article_tags_hashed
id
article_id
md5
count
This table will help us to find articles containing exact set of tags (A,B,C), but it won't help to find art开发者_开发知识库icles containing only (A, B), (A, C), (B, C)
What is the best practice?
PS: Sorry for my english, it's pretty bad.
I don't think you need article_tags_hashed because you can group and count the results when you
query the tags2articles.
Example:
select article_id, count(article_id) as tagcnt from tags2articles
where tag_id in (...)
group by article_id
order by tagcnt desc
The articles which contain the most tags will be placed first.
精彩评论