Algorithm for filling a matrix of item, item pairs

2023-01-21 05:45 问答作者：

Hey guys, I have a sort of speed dating type application (not used for dating, just a similar concept) that compares users and matches them in a round based event.

Currently I am storing each user to user comparison (using cosine similarity) and then finding a round in which both users are available. My current set up works fine for smaller scale but I seem to be missing a few matchings in larger data sets.

For example with a setup like so (assuming 6 users, 3 from each group)

Round (User1, User2)
----------------------------
1  (x1,y1)  (x2,y2)  (x3,y3)
2  (x1,y2)  (x2,y3)  (x3,y1)
3  (x1,y3)  (x2,y1)  (x3,y2)

My approach works well right now to ensure I have each user meeting the appropriate user without having overlaps so a user is left out, just not with larger data sets.

My current algorithm

I store a comparison of each user from x to each user from y like so

Round, user1, user2, similarity

And to build the event schedule I simply sort the comparisons by similarity and then iterate over the results, finding an open round for both users, like so:

event.user_maps.all(:order => 'similarity desc').each do |map|
  (1..event.rounds).each do |round|
    if user_free_in_round?(map.user1) and user_free_in_round?(map.user2)
      #creates the pairing and breaks from the loop
    end
  end
end

This isn't exact code but the general algorithm to build the schedule. Does anyone know a better way of filling in a matrix of item pairings where no one item can be in more than one place in the same slot?

EDIT

For some clarification, the issue I am having is that in larger sets my algorithm of placing highest similarity matches first can sometimes result in collisions. What I mean by that is that the users are paired in such a way that they have no other user to meet with.

Like so:

Round (User1, User2)
----------------------------
1  (x1,y1)  (x2,y2)  (x3,y3)
2  (x1,y3)  (x2,nil)  (x3,y1)
3  开发者_开发百科(x1,y2)  (x2,y1)  (x3,y2)

I want to be able to prevent this from happening while preserving the need for higher similar users given higher priority in scheduling.

In real scenarios there are far more matches than there are available rounds and an uneven number of x users to y users and in my test cases instead of getting every round full I will only have about 90% or so of them filled while collisions like the above are causing problems.

I think the question still needs clarification even after edit, but I could be missing something.

As far as I can tell, what you want is that each new round should start with the best possible matching (defined as sum of the cosine similarities of all the matched pairs). After any pair (x_i,y_j) have been matched in a round, they are not eligible for the next round.

You could do this by building a bipartite graph where your Xs are nodes in one side and Ys are nodes in another side, and the edge weight is cosine similarity. Then you find the max weighted match in this graph. For the next rounds, you eliminate the edges that have already been used in previous round and run the matching algorithm again. For details on how to code max weight matching in bipartite graph, see here.

BTW, this solution is not optimum since we are proceeding from one round to next in a greedy fashion. I have a feeling that getting the optimum solution would be NP hard, but I don't have a proof so can't be sure.

I agree that the question still needs clarification. As Amit expressed, I have a gut feeling that this is an NP hard problem, so I am assuming that you are looking for an approximate solution.

That said, I would need more information on the tradeoffs you would be willing to make (and perhaps I'm just missing something in your question). What are the explicit goals of the algorithm?

Is there a lower threshold for similarity below which you don't want a pairing to happen? I'm still a bit confused as to why there would be individuals which could not be paired up at all during a given round...

Essentially, you are performing a search over the space of possible pairings, correct? Maybe you could use backtracking or some form of constraint-based algorithm to make sure that you can obtain a complete solution for a given round...?

继续阅读：algorithm matrix ruby

Algorithm for filling a matrix of item, item pairs

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？