select equal sized number groups from a sequence of random numbers
say i have a list of 100 numbers, i want to split them into 5 groups which have the sum within each group closest to the mean of the numbers.
the simplest solution is to sort the hundred numbers and take th开发者_JAVA技巧e max number and keep adding the smallest numbers until the sum goes beyond the avg.
obviously that is not going to bring the best results. i guess we could use BFS or DFS or some other search algo. like A* to get the best result.
does anyone have a simple solution out there? pseudo code is good enough. thanks!
This sounds like a variation on the knapsack problem and if I'm interpreting you correctly, it may be the multiple knapsack problem. Can't you come up with an easy problem? :)
An efficient algorithm (solution) that can be used is a variation of the Best Fit of bin packing algorithm. However, we would have to be applying a variation in which we have specifically 5 different groups of numbers that we want rather than looking for using the smallest number of groups.
The algorithm begins by finding the mean of the list of all 100 numbers. This mean will be used as the max capacity for all 5 of the groups (bins) we are trying to fit numbers into. We then find the largest number in our list of 100 numbers that doesn’t exceed the max capacity of our group and assign it to our first group. (We can find this in log(n) time because we can use a self-balancing binary search tree). We keep track of the how filled our current group is. We then find the next largest number that fits into our current group until we have either reached max capacity or there are no other numbers that will allow this group to reach max capacity. In either of these cases we move on to the next group and repeat our algorithm with the numbers left in our list of numbers. Once we move on from a group we must also keep track of the current sum of that group. We continue until we have reached the highest capacity for all 5 groups. If there are any numbers left we place them in the groups that have the lowest total sum (because we kept track of these sums as we went). This is in fact a greedy algorithm with a Θ(nlogn) running time due to the nature of bin packing.
精彩评论