开发者

Select items based on popularity: Avoiding a glorified sort

I have a site where users can post and vote on suggestions. On the from page I initially list 10 suggestions and the header fetches a new random suggestion every 7 seconds.

I want the votes to influence the probability a suggestion will show up, both on the 10-suggestion list and in the header-suggestion. To that end I have a small algorithm to calculate popularity, taking into account votes, age and a couple other things (needs lots of tweaking).

Anyway, after running the algorithm I have a dictionary of suggestions and popularity index, sorted by popularity:

{ S = Suggestion1, P = 0.86  }
{ S = Suggestion2, P = 0.643 }
{ S = Suggestion3, P = 0.134 }
{ S = Suggestion4, P = 0.07  }
{ S = Suggestion5, P = 0.0   }
{ . . .}

I don't want this to be a glorified sort, so I'd like to introduce some random element to the selection process.

In short, I'd like the popularity to be the probability a suggestion gets picked out of the list.

Having a full list of suggestion/popularit开发者_开发技巧y, how do I go about picking 10 out based on probabilities? How can I apply the same to the looping header suggestion?


I'm afraid I don't know how to do this very fast, but if you have the collection in memory you can do it like this:

Note that you do not need to sort the list for this algorithm to work.

  1. First sum up all the probabilities (if the probability is linked to popularity, just sum the popularity numbers, where I assume higher values means higher probability)
  2. Calculate a random number in the range of 0 up to but not including that sum
  3. Start at one end of the list and iterate through it
  4. For each element, if the random number you generated is less than the popularity, pick that element
  5. If not, subtract the popularity of the element from the random number, and continue to the next

If the list is static, you could build ranges and do some binary searches, but if the list keeps changing, then I don't know a better way.

Here is a sample LINQPad program that demonstrates:

void Main()
{
    var list = Enumerable.Range(1, 9)
        .Select(i => new { V = i, P = i })
        .ToArray();
    list.Dump("list");

    var sum =
        (from element in list
         select element.P).Sum();

    Dictionary<int, int> selected = new Dictionary<int, int>();
    foreach (var value in Enumerable.Range(0, sum))
    {
        var temp = value;
        var v = 0;
        foreach (var element in list)
        {
            if (temp < element.P)
            {
                v = element.V;
                break;
            }

            temp -= element.P;
        }
        Debug.Assert(v > 0);
        if (!selected.ContainsKey(v))
            selected[v] = 1;
        else
            selected[v] += 1;
    }

    selected.Dump("how many times was each value selected?");
}

Output:

list 
[] (9 items)  
 V  P
 1  1 
 2  2 
 3  3 
 4  4 
 5  5 
 6  6 
 7  7 
 8  8 
 9  9 
45 45  <-- sum

how many times was each value selected? 
Dictionary<Int32,Int32> (9 items)  
Key Value
 1    1 
 2    2 
 3    3 
 4    4 
 5    5 
 6    6 
 7    7 
 8    8 
 9    9 
     45 <-- again, sum
 
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜