Incorrect results for average distance using Dijkstra (java)
The graph is unweighed, an element of the array of HashSets neighbours[] is a node neighbours[1] is node 1 (they start from 0 mind you) with its unique neighbouring nodes say 2 3 4 5. (so neighbours[5] will contain 1). And I have the following method I did with great deal of help as I dont get the algo much beyond theory. The number it returns should be the average distance between 2 nodes in the graph.
Imagine I have the following graph (node: in_links | out_links; neighbours[] does not contain the 0 loops at node 0, and no duplicates as I said.)
0: 0 0 0 | 0 0 0 1 1 1 2 3 5 6 7 7 8 8 9 9开发者_如何学JAVA 11
1: 0 0 0 | 2 2 3 4 4 5 6 8
2: 0 1 1 | 3
3: 0 1 2 | 4 9
4: 1 1 3 | 5 12
5: 0 1 4 | 6 7 10
6: 0 1 5 | 10 11 12
7: 0 0 5 |
8: 0 0 1 | 10
9: 0 0 3 | 12
10: 5 6 8 | 11
11: 0 6 10 |
12: 4 6 9 |
And for this trivial graph the distance that's returned is 5.781686749230769E8 ?!?! the code:
public double getAvgDistance() {
double total = 0;
int[] dist = new int[n];
ArrayList<Integer> Q = new ArrayList<Integer>();
int tmp, index = 0, w = 0;
for (int u=0; u<n; u++) {
System.out.print("Avg Dist at "+u+"\r");
// Initialise Q and dist for this iteration
for (int v=u+1; v<n; v++) {
Q.add(v);
if (neighbours[u].contains(v)) {
dist[v] = 1;
} else {
dist[v] = Integer.MAX_VALUE;
}
}
while (!Q.isEmpty()) {
tmp = dist[0];
for (int e=1; e<Q.size(); e++) {
if (dist[e] < tmp) {
w = Q.get(e);
tmp = dist[w]; // smallest dist is for this element w so far
index = e;
}
}
Q.remove(index);
for (int z : neighbours[w]) {
if ( Q.contains(z)
&& (dist[w]+1 < dist[z]) ) {
dist[z] = dist[w]+1;
}
}
} // while end
for (int v = u+1; v < n; v++ ) {
total += dist[v];
}
} // for 0-n end
return total /= (double)(n*(n-1)/2);
}
I don't have much experience with casting or printing real numbers so I hope its something to do with those! All comments welcome
If I understand your question correctly, nodes 7, 11 and 12 have no out links and therefore no valid paths to the other nodes.
Does your algorithm force a path by inserting a link with a cost of Integer.MAX_VALUE in these cases? If so, that would explain why you have such a very high average cost.
I also wondered whether it would be better to evaluate both forward and reverse paths. In a directed graph the cost of path AB is not necessarily the same as the cost of path BA. With your current algorithm, the cost of every path ending at node 12 is calculated, but no paths starting at node 12 are evaluated.
I'm only guessing, but I see distances occasionally being set to Integer.MAX_VALUE
. If those numbers are actually entering the result and then being divided down by one or two factors of 10, that would explain very nicely why the average is much, much larger than expected and roughly in the same ballpark as MAX_VALUE.
It's OK to have this large value in your graph when it is used to determine the shortest path among alternatives, but once you get to the point where you are determining actual distances, that number has to go!
Either you have a path length in the ballpark of MAX_VALUE, that says there is no path. That path length therefore doesn't go into your average. Or your path length is a small integer in the same magnitude as your in-graph distances, then it's valid and you can include it in your calculations.
The lesson to be taken home from this: Just because a number came out of a computer program that doesn't mean it's trustworthy or correct!
I'm not sure if I totally understand your question, but it sounds like you MAY be having trouble with the value you are printing not being exactly what you expect. I suspect the problem may be when you are printing the value of the double. Whenever you convert a double directly to a String, I know you can get unexpected results.
This post suggests using a BigDecimal instead of a double to maintain precision: Retain precision with double in Java
So perhaps try doing something like the following and see if you have better results.
BigDecimal.valueOf(<your double value here>).toPlainString();
精彩评论