Hockey pool Algorithm
This is a little fun project that I have started to try and maximize my chances of winning our office hockey pool. I'm trying to find the best way to select 20 players that will give me the most points within a maximum salary cap.
For example, imagine that the raw data is made of
- The Player Name
- Position (Forward, Defensemen, Goalie)
- Predicted amount of points for this season
- Salary.
Now I want the 20 players that will give me the most point within X salary cap. Later on, as a phase 2, I would like to do the same thing however in this case, I only want 12 forwards, 6 defensemen and 2 goalie.
Now the obvious way is to simply go every possible combination, however while this will work it is not a valid option as with 500 players, this would have too many possible combination. I could add some smart filters to reduce the 500 players to the top 50 forwards, top 30 defensemen and top 15 goalies but still, this would be a very slow process.
I'm wondering if there are other algorithms out there to implement this. This is just for fun and not an important business request. But if you have some thoughts on how to proceed, please let me know.
My first try was using the knapsack algorith with some help from other sources. It seems to work with just the Salary as a parameter. I'm struggling at figuring out how to add the 20 player team parameter. Its in .Net but should be easy to convert to Java.
I was thinking of doing a seperate loop to figure out the best teams with 20 players reguardless of salary and then compare the two list until I find the team that is highest on the two list. Not sure.
namespace HockeyPoolCalculator
{
public class ZeroOneKnapsack
{
protected List<Item> itemList = new List<Item>();
protected int maxSalary = 0;
protected int teamSize = 0;
protected int teamSalary = 0;
protected int points = 0;
protected bool calculated = false;
public ZeroOneKnapsack() { }
public ZeroOneKnapsack(int _maxSalary)
{
setMaxSalary(_maxSalary);
}
public ZeroOneKnapsack(List<Item> _itemList)
{
setItemList(_itemList);
}
public ZeroOneKnapsack(List<Item> _itemList, int _maxSalary)
{
setItemList(_itemList);
setMaxSalary(_maxSalary);
}
// calculte the solution of 0-1 knapsack problem with dynamic method:
public virtual List<Item> calcSolution()
{
int n = itemList.Count;
setInitialStateForCalculation();
if (n > 0 && maxSalary > 0)
{
List<List<int>> playerList = new List<List<int>>();
List<int> salaryList = new List<int>();
//initialise list
playerList.Add(salaryList);
for (int j = 0; j <= maxSalary; j++)
salaryList.Add(0);
// Loop through players
for (int i = 1; i <= n; i++)
{
List<int> prev = salaryList;
playerList.Add(salaryList = new List<int>());
for (int j = 0; j <= maxSalary; j++)
{
if (j > 0)
{
int wH = itemList.ElementAt(i - 1).getSalary();
// Is the players salary more than the current calculated salary? If yes, then keep current max points, else get the highest amount between previous max points at that salary and new max points.
salaryList.Add((wH > j)?prev.ElementAt(j): Math.Max(prev.ElementAt(j),itemList.ElementAt(i - 1).getPoints() + prev.ElementAt(j - wH)));
}
else
{
salaryList.Add(0);
}
} // for (j...)
} // for (i...)
points = salaryList.ElementAt(maxSalary);
for (int i = n, j = maxSalary; i > 0 && j >= 0; i--)
{
int tempI = playerList.ElementAt(i).ElementAt(j);
int tempI_1 = playerList.ElementAt(i - 1).ElementAt(j);
if ((i == 0 && tempI > 0)||(i > 0 && tempI != tempI_1))
{
Item iH = itemList.ElementAt(i - 1);
int wH = iH.getSalary();
iH.setInKnapsack(1);
j -= wH;
teamSalary += wH;
}
} // for()
calculated = true;
} // if()
return itemList;
}
// add an item to the item list
public void add(String name, int Salary, int value)
{
if (name.Equals(""))
name = "" + (itemList.Count() + 1);
itemList.Add(new Item(name, Salary, value));
setInitialStateForCalculation();
}
// add an item to the item list
public void add(int Salary, int value)
{
add("", Salary, value); // the name will be "itemList.size() + 1"!
}
// remove an item from the item list
public void remove(String name)
{
for (int pointer = 0; pointer <= itemList.Count-1; pointer++)
{
itemList[pointer].getName().Equals("");
if (itemList.ElementAt(pointer).getName().Equals(itemList.ElementAt(pointer).getName()))
{
itemList.Remove(itemList.ElementAt(pointer));
}
}
setInitialStateForCalculation();
}
// remove all items from the item list
public void removeAllItems()
{
itemList.Clear();
setInitialStateForCalculation();
}
public int getPoints()
{
if (!calculated)
calcSolution();
return points;
}
public int getSolutionSalary() { return teamSalary; }
public bool isCalculated() { return calculated; }
public int getMaxSalary() { return maxSalary; }
public void setTeamSize(int _teamSize)
{
teamSize = _teamSize;
}
public int getTeamSize()
{
return teamSize;
}
public void setMaxSalary(int _maxSalary)
{
maxSalary = 开发者_运维知识库Math.Max(_maxSalary, 0);
}
public void setItemList(List<Item> _itemList) {
if (_itemList != null) {
itemList = _itemList;
foreach (Item item in _itemList) {
item.checkMembers();
}
}
}
// set the member with name "inKnapsack" by all items:
private void setInKnapsackByAll(int inKnapsack) {
foreach (Item item in itemList)
if (inKnapsack > 0)
item.setInKnapsack(1);
else
item.setInKnapsack(0);
}
// set the data members of class in the state of starting the calculation:
protected void setInitialStateForCalculation()
{
setInKnapsackByAll(0);
calculated = false;
points = 0;
teamSalary = 0;
teamSize = 0;
}
}
}
Thanks for your help!
Unfortunately, you shouldn't expect to find a good solution to this problem as it is NP-hard. Unless P = NP, there aren't any polynomial-time algorithms, and exhaustive search is likely to be one of the best algorithms (though you might use some heuristics to speed it up).
To see that this problem is NP-hard, we'll show how to reduce the knapsack problem to it in polynomial time. Given any instance of the subset sum problem consisting of a set S = {(weight1, value1), (weight2, value2), ... , (weightn, valuen)} and weight limit k, we can construct an instance of your hockey problem by creating a set of players whose salary is the weight and whose expected points is the value. We then try to find the maximum-weight combination of players whose salary doesn't exceed k, which is then identical to the maximum sum you can make in the original knapsack problem that doesn't exceed the target weight.
As the code you've posted shows, though, there is a pseudopolynomial time algorithm for solving the knapsack problem. Assuming salaries are low (or that you can normalize them to small numbers), you might be able to use this to good effect.
While it's doubtful that there is a polynomial-time algorithm to get the exact answer, if you're okay with an approximately optimal solution, there is a polynomial-time algorithm for approximating solutions to the knapsack problem. For details, check out these notes, which detail two algorithms. Interestingly, they rely on the pseudopolynomial time algorithm that you seem to be using, so perhaps they'll be easy to implement?
Sorry to dash your hopes for a nice solution with math... NP-hard problems tend to do that. :-(
Plot the players on a graph, X-axis is points, Y-axis is salary, zeroes at the origin. Lower right players will be desirable cheap high scorers, upper left players will be undesirable expensive low scorers, players on the diagonal will be equivalent (same cost per point). Sweep an origin-anchored radius from the X-horizontal counter-clockwise to the Y-vertical, forming an ever-growing pie slice, until either 20 players are inside the slice or the total salaries inside the slice reaches the cap. If you reach the $ cap but not 20 players, delete the "upper leftmost" player inside the slice and continue. If you reach 20 players but not the cap, you can delete a low-scorer from the slice to make room for a higher-scoring player just about to enter the slice, making your total cost per point unnecessarily rise, but since it's funny money and you're not a real owner, go for it.
A fun way to tackle that kind of problem is to use a genetic algorithm. And actually, I did just that for my own hockey pool!
You can see the whole Scala code here if you are curious, but the heart of it is:
def GA(nbIter: Int, popSize: Int, tournamentSize: Int) = {
def loop(iter: Int, pop: Seq[LineUp]): LineUp = {
val best = pop.maxBy(_.fitness)
println("\nBest of generation " + iter + best)
if (iter == nbIter)
best
else {
val newPop = for {
_ ← Stream.continually()
x = tournament(pop, tournamentSize).head
y = (x.mutants take 5) maxBy (_.fitness)
} yield y
loop(iter + 1, best +: newPop.take(popSize - 1))
}
}
val initialPop = LineUp.randomStream.take(popSize)
loop(0, initialPop)
}
It starts by generating a random collection of valid lineups (respecting salary cap and position constraints). For each iteration, it then generates a new population by using a combination of tournament selection and hill climbing. "Fitness" is simply defined as the expected number of points for a lineup with the lowest salary as a tie breaker.
It's just a heuristic, of course, but if your list of players is not too big and you can afford to let the algorithm run for a while, there is a good chance you'll find a fairly optimal solution.
The problem might be hard, but you can use the fact that your goal-keeper is not gonna play in offense (more formal: the entities you select are of different categories) in order to improve your runtime.
Also, you might want to find an approximate solution to your problem. Use that value to bound the optimal solution and the search for it.
You can easily formulate it as ILP. Solving them is also NP-Hard, but the problem instance seems not that big, so it maybe solvable perfect (one solver is e.g. lpsolve). Even if it is not solvable perfect because of the problem size, there exist good heuristics for ILP.
精彩评论