Wikipedia A* pathfinding algorithm takes a lot of time
I've successfully implemented A* pathfinding in C#, but it is very slow and I don't understand why. I even tried not sorting the openNodes list, but it's still just as slow.
The map is 80x80, and there are 10-11 nodes.
I took the pseudocode from the Wikipedia article on A*.
This is my implementation:
public static List<PGNode> Pathfind(PGMap mMap, PGNode mStart, PGNode mEnd)
{
    mMap.ClearNodes();
    mMap.GetTile(mStart.X, mStart.Y).Value = 0;
    mMap.GetTile(mEnd.X, mEnd.Y).Value = 0;

    List<PGNode> openNodes = new List<PGNode>();
    List<PGNode> closedNodes = new List<PGNode>();
    List<PGNode> solutionNodes = new List<PGNode>();

    mStart.G = 0;
    mStart.H = GetManhattanHeuristic(mStart, mEnd);

    solutionNodes.Add(mStart);
    solutionNodes.Add(mEnd);

    openNodes.Add(mStart); // 1) Add the starting square (or node) to the open list.

    while (openNodes.Count > 0) // 2) Repeat the following:
    {
        openNodes.Sort((p1, p2) => p1.F.CompareTo(p2.F));
        PGNode current = openNodes[0]; // a) We refer to this as the current square.

        if (current == mEnd)
        {
            while (current != null)
            {
                solutionNodes.Add(current);
                current = current.Parent;
            }
            return solutionNodes;
        }

        openNodes.Remove(current);
        closedNodes.Add(current); // b) Switch it to the closed list.

        List<PGNode> neighborNodes = current.GetNeighborNodes();

        double cost = 0;
        bool isCostBetter = false;

        for (int i = 0; i < neighborNodes.Count; i++)
        {
            PGNode neighbor = neighborNodes[i];
            cost = current.G + 10;
            isCostBetter = false;

            if (neighbor.Passable == false || closedNodes.Contains(neighbor))
                continue; // If it is not walkable or if it is on the closed list, ignore it.

            if (openNodes.Contains(neighbor) == false)
            {
                openNodes.Add(neighbor); // If it isn't on the open list, add it to the open list.
                isCostBetter = true;
            }
            else if (cost < neighbor.G)
            {
                isCostBetter = true;
            }

            if (isCostBetter)
            {
                neighbor.Parent = current; // Make the current square the parent of this square.
                neighbor.G = cost;
                neighbor.H = GetManhattanHeuristic(current, neighbor);
            }
        }
    }

    return null;
}
Here's the heuristic I'm using:
private static double GetManhattanHeuristic(PGNode mStart, PGNode mEnd)
{
    return Math.Abs(mStart.X - mEnd.X) + Math.Abs(mStart.Y - mEnd.Y);
}
What am I doing wrong? I've been staring at the same code for an entire day.
First off, use a profiler. Use tools to tell you what is slow; it is often surprising what is slow. And use a debugger. Make a map with five nodes in it and step through every line of the code as it tries to find the path. Did anything unexpected happen? If so, figure out what is going on.
Second, leaving aside your performance problem, I think you also have a correctness problem. Can you explain to us why you believe the Manhattan distance is a reasonable heuristic?
(For those reading this who are unfamiliar with the metric, the "Manhattan distance" or "taxi distance" is the distance between two points you'd have to travel if you lived in a city laid out on a grid. That is, to go 14 miles due northeast you'd have to travel 10 miles north and then 10 miles east for a total of 20 miles.)
The A* algorithm requires for its correctness that the heuristic always underestimates the actual distance required to travel between two points. If there are any "diagonal shortcut" streets in the graph then the Manhattan distance overestimates the distance on those paths and therefore the algorithm will not necessarily find the shortest path.
Why aren't you using the Euclidean distance as your heuristic?
Have you tried your algorithm using "always zero" as the heuristic? That is guaranteed to be an underestimate. (Doing so gives you an implementation of Dijkstra's Algorithm.)
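For experimenting, here is a sketch of both alternatives against the PGNode type from the question:

private static double GetEuclideanHeuristic(PGNode a, PGNode b)
{
    double dx = a.X - b.X;
    double dy = a.Y - b.Y;
    return Math.Sqrt(dx * dx + dy * dy); // straight-line distance never overestimates
}

private static double GetZeroHeuristic(PGNode a, PGNode b)
{
    return 0; // always an underestimate; A* degenerates into Dijkstra's algorithm
}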
Third, you seem to be doing an awful lot of sorting in this implementation. Surely you could be using a priority queue; that's a lot cheaper than sorting.
Fourth, I have an implementation of A* in C# 3 on my blog that you are welcome to use; no warranty expressed or implied, use at your own risk.
http://blogs.msdn.com/b/ericlippert/archive/tags/astar/
My code is very straightforward; the algorithm in my implementation looks like this:
var closed = new HashSet<Node>();
var queue = new PriorityQueue<double, Path<Node>>();
queue.Enqueue(0, new Path<Node>(start));
while (!queue.IsEmpty)
{
    var path = queue.Dequeue();
    if (closed.Contains(path.LastStep)) continue;
    if (path.LastStep.Equals(destination)) return path;
    closed.Add(path.LastStep);
    foreach (Node n in path.LastStep.Neighbours)
    {
        double d = distance(path.LastStep, n);
        var newPath = path.AddStep(n, d);
        queue.Enqueue(newPath.TotalCost + estimate(n), newPath);
    }
}
The idea is that we maintain a priority queue of paths; that is, a queue of paths that is always able to tell you the path so far with the least distance. Then we check to see if we've made it to our destination; if so, we're done. If not, then we enqueue a bunch of new paths based on their (under)estimated distance to the goal.
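The PriorityQueue<TPriority, TValue> used above is not a built-in type in C# 3; the blog series builds one. A minimal sketch of one workable shape, a SortedDictionary of FIFO queues keyed by priority (not the blog's exact code), might look like this:

using System.Collections.Generic;
using System.Linq;

class PriorityQueue<TPriority, TValue>
{
    private readonly SortedDictionary<TPriority, Queue<TValue>> storage =
        new SortedDictionary<TPriority, Queue<TValue>>();

    public bool IsEmpty
    {
        get { return storage.Count == 0; }
    }

    public void Enqueue(TPriority priority, TValue value)
    {
        Queue<TValue> bucket;
        if (!storage.TryGetValue(priority, out bucket))
        {
            bucket = new Queue<TValue>();
            storage.Add(priority, bucket);
        }
        bucket.Enqueue(value);
    }

    public TValue Dequeue()
    {
        // SortedDictionary enumerates in key order, so First() is the minimum priority.
        var pair = storage.First(); // throws if the queue is empty
        TValue value = pair.Value.Dequeue();
        if (pair.Value.Count == 0)
            storage.Remove(pair.Key);
        return value;
    }
}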
Fifth, that pseudocode in Wikipedia could be improved. The fact that my actual code is in many ways easier to follow than the pseudocode indicates that maybe there is too much detail in the pseudocode.
A couple of notes:

List<T> is not optimized for removing the first element. Better would be to sort in the opposite order and take the last element, or use a Stack<T> or Queue<T>.

List.Remove(current) is extremely inefficient. You already know the index you want to remove; don't search the entire list for the element.

Keeping openNodes sorted by inserting new nodes in the correct location should be much faster than re-sorting the entire list continually. Skipping the sort breaks the entire algorithm by removing important invariants; you need to make the sort faster, not skip it.

The primary operation you're doing on closedNodes is a presence test, closedNodes.Contains(). Use a data structure that is optimized for that, e.g. HashSet<T>. Or better yet, put a closed flag field in each node and clear them all at the beginning of each pass; this will be significantly faster than doing a linear search through closedNodes in each iteration (see the sketch after these notes).

You shouldn't put anything in solutionNodes initially; both mEnd and mStart will get added during the final loop that traverses the path.

neighborNodes could be an IEnumerable<T> instead of a List<T>, since you never need the entire list at once. Using foreach would also be slightly faster than enumerating the list by index.
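Put together, a sketch of the question's inner loop with these notes applied (assuming the PGNode type from the question; note that H is computed to the goal here, which is what A* expects):

HashSet<PGNode> closedNodes = new HashSet<PGNode>();
// ... inside the main while loop, after `current` has been taken off openNodes:
closedNodes.Add(current);
foreach (PGNode neighbor in current.GetNeighborNodes())
{
    if (!neighbor.Passable || closedNodes.Contains(neighbor)) // membership test is now O(1)
        continue;

    double cost = current.G + 10;
    bool isCostBetter;
    if (!openNodes.Contains(neighbor)) // still O(n); a priority queue fixes this too
    {
        openNodes.Add(neighbor);
        isCostBetter = true;
    }
    else
    {
        isCostBetter = cost < neighbor.G;
    }

    if (isCostBetter)
    {
        neighbor.Parent = current;
        neighbor.G = cost;
        neighbor.H = GetManhattanHeuristic(neighbor, mEnd); // estimate to the goal
    }
}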
You can compare it with (or just make use of) the A* implementation in the QuickGraph library:
QuickGraph.Algorithms.ShortestPath.AStarShortestPathAlgorithm<TVertex,TEdge>
You compute the traversal cost of a node like this:
cost = current.G + 10;
Yet for the heuristic you use a plain, unscaled distance. Why not use the same scale in both? Depending on how far apart your nodes are, your heuristic may be far too low to guide the search effectively.
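For example, a sketch that puts the heuristic on the same scale as the per-step cost of 10 used in cost = current.G + 10:

private const double MoveCost = 10;

private static double GetManhattanHeuristic(PGNode mStart, PGNode mEnd)
{
    // Same units as G: each grid step costs 10, so the estimate should too.
    return MoveCost * (Math.Abs(mStart.X - mEnd.X) + Math.Abs(mStart.Y - mEnd.Y));
}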
Another "detail" which might be wrong: current.GetNeighborNodes
. How is this implemented? Does it return the same node as it should for the same location, so that the same node on different paths is shared, or does it always allocated a new node using new?
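For the shared-node version, a sketch (the Map property and GetNode accessor are hypothetical; the point is that the same PGNode object comes back for the same coordinates):

public List<PGNode> GetNeighborNodes()
{
    var result = new List<PGNode>();
    int[,] offsets = { { 0, -1 }, { 1, 0 }, { 0, 1 }, { -1, 0 } };
    for (int i = 0; i < 4; i++)
    {
        // GetNode returns the one node stored per tile, or null off the map,
        // so Parent/G/H written by the search are visible on later visits.
        PGNode n = Map.GetNode(X + offsets[i, 0], Y + offsets[i, 1]);
        if (n != null)
            result.Add(n);
    }
    return result;
}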
What is the memory consumption like? Download the Red Gate tools. Use the Performance Profiler to see where the most time is being spent, and the Memory Profiler to determine whether you have any issues with memory leaks or objects not being disposed quickly enough.
As @Rodrigo pointed out, you could have a large map to deal with. Nested linear scans over a large map will never be performant.
Are you using a grid for your terrain representation? If so, the best heuristic in that case is the octile distance:

Heuristic cost = min(dx, dy) * sqrt(2) + max(dx, dy) - min(dx, dy)

where dx and dy are the absolute differences in x and y. On a grid that allows diagonal movement this is exact for an unobstructed path, so it is always admissible. Unfortunately this heuristic isn't so well known.
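In C#, a sketch of this heuristic against the question's PGNode type:

private static double GetOctileHeuristic(PGNode a, PGNode b)
{
    double dx = Math.Abs(a.X - b.X);
    double dy = Math.Abs(a.Y - b.Y);
    double min = Math.Min(dx, dy);
    double max = Math.Max(dx, dy);
    return min * Math.Sqrt(2) + (max - min); // diagonal moves first, then straight moves
}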
Another useful tip is to choose a data structure for your open list that suits the size of your map. If your map is relatively small (100 x 100), an unsorted array will be the fastest way to go: to remove an element, simply swap it with the last element and drop the last element. If you have a bigger map, use a heap. You only care about the cheapest node, so sorting everything else has no benefit; a heap inserts and removes with complexity log n, perfect for medium and large data sets but slow for small ones.
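In C#, the equivalent of that swap-and-remove trick on a List<PGNode> might look like this (a sketch; SwapRemoveAt is not a framework method):

static void SwapRemoveAt(List<PGNode> list, int index)
{
    list[index] = list[list.Count - 1]; // overwrite the victim with the last element
    list.RemoveAt(list.Count - 1);      // removing the tail of a List<T> is O(1)
}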
Lastly, if speed is that important, implement Jump Point Search. It is, on average, 20 to 30 times faster than plain A*, with no memory overhead (or so the research paper claims; I haven't found independent verification). It basically substitutes the "find neighbors" step of A* with "find successors", so incorporating it into your code should be relatively simple.
Hope that helps.