Matching weighted tags in a closest-first manner

2023-03-22 13:28 Q&A author:

This is a bit of an open-ended "how would you approach this type of situation" question.

I'm building a system in which the user is asked to select any number of items from a list of categories. For each category they select, they are asked to assign a weight to it (a value of 1-100 indicating importance). I guess the best way of describing these user-categories is weighted tags. So, I might really enjoy eating bananas, so that gets 100, whereas apples, which I quite enjoy, get 50. I hate plums, so I don't select that.

Certain other entities in the system will be doing exactly the same and will have their own set of tags, each with a weight. In the above scenario, an item may be a "Farm", and its output of each type of fruit gives the weighting values. What I want to find is the best-matching farms for the user's taste in fruits (for example). This may look something like:

User A: [Tag1: 100, Tag2: 50, Tag4: 10]

Item A: [Tag2: 40, Tag3: 20]

Item B: [Tag1: 100, Tag2: 50, Tag4: 10]

Item C: [Tag3: 20, Tag4: 5]

In this situation, Item B is obviously a perfect match for User A, so it would be at the top of the result set. What I really want is a system that can order the items by decreasing relevance to a specific user.

I've toyed around with SQL and NoSQL (redis) implementations, attempting a solution, but each time I find myself iterating through a rather large dataset and doing basic math against each tag in each item to calculate the overall difference. Whilst this works, it's going to be slow, and if we're talking about a system with thousands of "Items", I'd imagine this would be brought to its knees fairly quickly.
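For illustration, the brute-force comparison described above might look roughly like the sketch below. It is only a guess at the approach, not the actual implementation: the names `total_difference` and `rank_items` are hypothetical, the distance is assumed to be a sum of absolute weight differences, and a tag missing on either side is assumed to count as weight 0.

```python
# Data copied from the User A / Item A-C example above.
user_a = {"Tag1": 100, "Tag2": 50, "Tag4": 10}
items = {
    "Item A": {"Tag2": 40, "Tag3": 20},
    "Item B": {"Tag1": 100, "Tag2": 50, "Tag4": 10},
    "Item C": {"Tag3": 20, "Tag4": 5},
}


def total_difference(user, item):
    """Sum of absolute weight differences over every tag either side uses.

    Assumption: a tag that is absent on one side is treated as weight 0.
    """
    tags = set(user) | set(item)
    return sum(abs(user.get(t, 0) - item.get(t, 0)) for t in tags)


def rank_items(user, items):
    """Return (name, difference) pairs, smallest difference first."""
    scored = [(name, total_difference(user, tags)) for name, tags in items.items()]
    return sorted(scored, key=lambda pair: pair[1])


ranking = rank_items(user_a, items)
for name, diff in ranking:
    print(name, diff)
```

Every item is visited once per query, so the cost grows linearly with the number of items times the number of tags, which is the scaling concern raised above.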

I can't think of a way to implement this directly in SQL, given that there are two many-to-many style relationships involved across three entities (Item, User, Category/Tag). I can't even begin to wrap my head around how the weighting values from the adjoining tables User-Category and Item-Category could be compared in SQL to produce a final output.

I guess what I'm asking for is a few ideas on how to even approach this problem.

Cheers, John
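One possible way to compare the two link tables in SQL is to join them on the tag and aggregate per item. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table names `user_tag` and `item_tag`, their columns, and the choice of score (the sum of products of matching weights, i.e. a dot product) are all assumptions made for illustration. Note that a dot product only counts shared tags and favours items with large weights, so it is not the same measure as an overall difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (owner, tag, weight) in each link table.
conn.executescript("""
    CREATE TABLE user_tag (user_id TEXT, tag TEXT, weight INTEGER);
    CREATE TABLE item_tag (item_id TEXT, tag TEXT, weight INTEGER);
""")
conn.executemany(
    "INSERT INTO user_tag VALUES (?, ?, ?)",
    [("User A", "Tag1", 100), ("User A", "Tag2", 50), ("User A", "Tag4", 10)],
)
conn.executemany(
    "INSERT INTO item_tag VALUES (?, ?, ?)",
    [
        ("Item A", "Tag2", 40), ("Item A", "Tag3", 20),
        ("Item B", "Tag1", 100), ("Item B", "Tag2", 50), ("Item B", "Tag4", 10),
        ("Item C", "Tag3", 20), ("Item C", "Tag4", 5),
    ],
)

# Join on the shared tag, then sum weight products per item for one user.
QUERY = """
    SELECT i.item_id, SUM(u.weight * i.weight) AS score
    FROM user_tag AS u
    JOIN item_tag AS i ON i.tag = u.tag
    WHERE u.user_id = ?
    GROUP BY i.item_id
    ORDER BY score DESC
"""
rows = conn.execute(QUERY, ("User A",)).fetchall()
for item_id, score in rows:
    print(item_id, score)
```

With an index on the `tag` column of `item_tag`, the database only has to touch items that share at least one tag with the user, rather than every item.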

The problem you're trying to solve looks related to the nearest neighbor problem, which for tagged data like you've mentioned can be solved using a variety of data structures. I'm not much of a SQL person, but I bet that if you search for nearest-neighbor algorithms you will find something that looks like what you want.
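As a rough sketch of what a nearest-neighbor search over weighted tags could look like, the example below treats each user and item as a sparse vector of tag weights and ranks items by cosine similarity. The choice of cosine similarity, the inverted index from tag to items, and the names `build_index` and `nearest` are assumptions for illustration, not something prescribed above; other distance measures or dedicated nearest-neighbor structures may suit the real data better.

```python
import math
from collections import defaultdict

user_a = {"Tag1": 100, "Tag2": 50, "Tag4": 10}
items = {
    "Item A": {"Tag2": 40, "Tag3": 20},
    "Item B": {"Tag1": 100, "Tag2": 50, "Tag4": 10},
    "Item C": {"Tag3": 20, "Tag4": 5},
}


def build_index(items):
    """Build an inverted index (tag -> [(item, weight)]) and each item's vector length."""
    index = defaultdict(list)
    norms = {}
    for name, tags in items.items():
        norms[name] = math.sqrt(sum(w * w for w in tags.values()))
        for tag, weight in tags.items():
            index[tag].append((name, weight))
    return index, norms


def nearest(user, index, norms, k=10):
    """Return up to k (item, cosine similarity) pairs, most similar first.

    Only items sharing at least one tag with the user are scored.
    """
    user_norm = math.sqrt(sum(w * w for w in user.values()))
    dots = defaultdict(float)
    for tag, user_weight in user.items():
        for name, item_weight in index.get(tag, []):
            dots[name] += user_weight * item_weight
    scored = [
        (name, dot / (user_norm * norms[name]))
        for name, dot in dots.items()
        if user_norm > 0 and norms[name] > 0  # skip all-zero vectors
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


index, norms = build_index(items)
results = nearest(user_a, index, norms)
for name, similarity in results:
    print(name, round(similarity, 3))
```

The index is built once and reused across queries, so each lookup does work proportional to the number of items that share a tag with the user instead of scanning the whole item set.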

