Memory Choke on Branch And Bound Knapsack Implementation
I wrote this implementation of the branch and bound knapsack algorithm based on the pseudo-Java code from here. Unfortunately, it runs out of memory on large instances of the problem, like this one. Why does that happen, and how can I make this implementation more memory efficient?
The input file at the link is formatted as follows:
numberOfItems maxWeight
profitOfItem1 weightOfItem1
.
.
.
profitOfItemN weightOfItemN
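For reference, reading that format into the structures used below is straightforward. Here is a minimal sketch (the KnapsackReader name and the Item fields are my assumptions based on the code below, not part of the original post):

import java.io.File;
import java.io.FileNotFoundException;
import java.util.LinkedList;
import java.util.Scanner;

class KnapsackReader {
    // Reads "numberOfItems maxWeight" followed by one "profit weight" pair per line.
    static LinkedList<Item> read(String path, double[] maxWeightOut) throws FileNotFoundException {
        Scanner in = new Scanner(new File(path));
        int numberOfItems = in.nextInt();
        maxWeightOut[0] = in.nextInt(); // maxWeight, returned through the one-element array
        LinkedList<Item> items = new LinkedList<Item>();
        for (int i = 0; i < numberOfItems; i++) {
            Item item = new Item();
            item.value = in.nextInt();
            item.weight = in.nextInt();
            item.valueWeightQuotient = item.value / item.weight;
            items.add(item);
        }
        in.close();
        return items;
    }
}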
// http://books.google.com/books?id=DAorddWEgl0C&pg=PA233&source=gbs_toc_r&cad=4#v=onepage&q&f=true
import java.util.Comparator;
import java.util.LinkedList;
import java.util.PriorityQueue;
class ItemComparator implements Comparator<Item> {
    public int compare(Item i1, Item i2) {
        if (i1.valueWeightQuotient < i2.valueWeightQuotient)
            return 1;
        if (i2.valueWeightQuotient < i1.valueWeightQuotient)
            return -1;
        // valueWeightQuotients are equal; break ties by weight
        if (i1.weight < i2.weight)
            return 1;
        if (i2.weight < i1.weight)
            return -1;
        return 0;
    }
}
class Node {
    int level;
    int profit;
    int weight;
    double bound;
}
class NodeComparator implements Comparator<Node> {
    public int compare(Node n1, Node n2) {
        if (n1.bound < n2.bound)
            return 1;
        if (n2.bound < n1.bound)
            return -1;
        return 0;
    }
}
class Solution {
    long weight;
    long value;
}
public class BranchAndBound {

    static Solution branchAndBound2(LinkedList<Item> items, double W) {
        double timeStart = System.currentTimeMillis();
        int n = items.size();
        int[] p = new int[n];
        int[] w = new int[n];
        for (int i = 0; i < n; i++) {
            p[i] = (int) items.get(i).value;
            w[i] = (int) items.get(i).weight;
        }

        Node u;
        Node v = new Node(); // dummy root, initialized to level -1
        int maxProfit = 0;
        int usedWeight = 0;
        NodeComparator nc = new NodeComparator();
        PriorityQueue<Node> PQ = new PriorityQueue<Node>(n, nc);
        v.level = -1;
        v.profit = 0;
        v.weight = 0;
        v.bound = bound(v, W, n, w, p);
        PQ.add(v);

        while (!PQ.isEmpty()) {
            v = PQ.poll();
            if (v.bound > maxProfit) { // check if the node is still promising
                // child that includes the next item
                u = new Node();
                u.level = v.level + 1;
                u.weight = v.weight + w[u.level];
                u.profit = v.profit + p[u.level];
                if (u.weight <= W && u.profit > maxProfit) {
                    maxProfit = u.profit;
                    usedWeight = u.weight;
                }
                u.bound = bound(u, W, n, w, p);
                if (u.bound > maxProfit) {
                    PQ.add(u);
                }
                // child that does not include the next item
                u = new Node();
                u.level = v.level + 1;
                u.weight = v.weight;
                u.profit = v.profit;
                u.bound = bound(u, W, n, w, p);
                if (u.bound > maxProfit) {
                    PQ.add(u);
                }
            }
        }

        Solution solution = new Solution();
        solution.value = maxProfit;
        solution.weight = usedWeight;
        double timeStop = System.currentTimeMillis();
        double elapsedTime = timeStop - timeStart;
        System.out.println("* Time spent in branch and bound (milliseconds): " + elapsedTime);
        return solution;
    }

    // Upper bound via the linear relaxation: greedily take whole items,
    // then a fraction of the first item that does not fit.
    static double bound(Node u, double W, int n, int[] w, int[] p) {
        if (u.weight >= W)
            return 0;
        double result = u.profit;
        int totWeight = u.weight;
        int j = u.level + 1;
        int weightSum;
        while (j < n && (weightSum = totWeight + w[j]) <= W) {
            totWeight = weightSum; // grab as many whole items as possible
            result = result + p[j];
            j++;
        }
        int k = j; // use k for consistency with the formula in the text
        if (k < n) {
            result = result + (W - totWeight) * p[k] / w[k]; // grab a fraction of the excluded kth item
        }
        return result;
    }
}
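Note that the post doesn't show the Item class the code depends on; a minimal version consistent with the fields referenced above would be:

// Minimal Item class assumed by the code above (not shown in the original post)
class Item {
    double value;                // profit of the item
    double weight;
    double valueWeightQuotient;  // value / weight, used by ItemComparator
}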
I got a slightly faster implementation by taking away all the generic Collection instances and using plain arrays instead.
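For instance, the LinkedList<Item> plus ItemComparator combination can be replaced with parallel profit/weight arrays sorted by ratio up front; a sketch of the idea (my own illustration, not the poster's actual code):

class RatioSort {
    // Sorts the parallel profit/weight arrays in place by decreasing
    // profit/weight ratio, so no Item objects or Collections are needed.
    static void sortByRatio(int[] p, int[] w) {
        int n = p.length;
        // selection sort for clarity; use a faster sort for big inputs
        for (int i = 0; i < n - 1; i++) {
            int best = i;
            for (int j = i + 1; j < n; j++) {
                if ((double) p[j] / w[j] > (double) p[best] / w[best]) {
                    best = j;
                }
            }
            int tp = p[i]; p[i] = p[best]; p[best] = tp;
            int tw = w[i]; w[i] = w[best]; w[best] = tw;
        }
    }
}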
Not sure whether you still need insight into the algorithm or whether your tweaks have solved your problem, but with a best-first branch and bound algorithm like the one you've implemented (the priority queue always expands the most promising node first) there's always going to be the potential for a memory problem. You're hoping, of course, to rule out enough branches as you go to keep the number of nodes in your priority queue small, but in the worst case you could end up holding nearly as many nodes in memory as there are subsets of the item set, i.e. up to 2^n. That worst case is highly unlikely, but for large problem instances even an average tree can populate your priority queue with millions of nodes.
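If you want to stay with the best-first approach, one mitigation (my own suggestion, and it only delays the worst case rather than eliminating it) is to periodically sweep the queue and discard nodes whose bound can no longer beat the current maxProfit, since those entries are dead weight. A method along these lines could be added to the BranchAndBound class above:

static void purge(PriorityQueue<Node> PQ, int maxProfit) {
    // Rebuild the queue, keeping only nodes that are still promising.
    // The sweep is O(size), so trigger it sparingly, e.g. whenever the
    // queue has doubled since the last purge.
    PriorityQueue<Node> kept = new PriorityQueue<Node>(Math.max(1, PQ.size()), PQ.comparator());
    for (Node node : PQ) {
        if (node.bound > maxProfit) {
            kept.add(node);
        }
    }
    PQ.clear();
    PQ.addAll(kept);
}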
If you're going to be throwing lots of unforeseen large problem instances at your code and need the peace of mind of knowing that, no matter how many branches the algorithm has to consider, you'll never run out of memory, I'd consider a depth-first branch and bound algorithm, like the Horowitz-Sahni algorithm outlined in section 2.5.1 of this book: http://www.or.deis.unibo.it/knapsack.html. Depth-first search only ever keeps the current path in memory, so its space usage is linear in the number of items. For some problem instances this approach will consider more candidate solutions before finding the optimal one, and for others fewer; it really depends on the structure of the tree. A rough sketch of the idea is below.
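To make the contrast concrete, here is a simplified depth-first sketch in the same style (my own illustration of the idea, not the book's exact Horowitz-Sahni formulation; it assumes profits are non-negative and that items are pre-sorted by decreasing profit/weight ratio):

public class DepthFirstKnapsack {
    static int best = 0; // incumbent: best profit found so far

    static void search(int level, int profit, int weight, int[] p, int[] w, int W) {
        if (weight > W) return;              // infeasible branch
        if (profit > best) best = profit;    // update incumbent
        if (level == p.length) return;       // leaf: no items left
        if (bound(level, profit, weight, p, w, W) <= best) return; // prune
        // branch: include item `level`, then exclude it
        search(level + 1, profit + p[level], weight + w[level], p, w, W);
        search(level + 1, profit, weight, p, w, W);
    }

    // Same fractional (linear relaxation) bound as in the code above.
    static double bound(int level, int profit, int weight, int[] p, int[] w, int W) {
        double result = profit;
        int totWeight = weight;
        int j = level;
        while (j < p.length && totWeight + w[j] <= W) {
            totWeight += w[j];
            result += p[j];
            j++;
        }
        if (j < p.length) {
            result += (W - totWeight) * (double) p[j] / w[j];
        }
        return result;
    }
}

Because the only per-node state lives on the call stack, memory use is bounded by the recursion depth n regardless of how many nodes get explored; for very large n you'd convert this to the iterative formulation the book gives.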