How to recalculate all-pairs shortest paths online when nodes are removed?
Recent news about the underground bombings made me curious about the following problem. Assume we have a weighted undirected graph whose nodes are occasionally removed. The problem is to quickly recalculate the shortest paths between all pairs of nodes after such removals.
With a simple modification of the Floyd-Warshall algorithm we can calculate the shortest paths between all pairs. These paths may be stored in a table, where shortest[i][j] contains the index of the next node on the shortest path between i and j (or NULL if there's no path). The algorithm requires O(n³) time to build the table, and each query shortest(i, j) takes O(1). Unfortunately, we have to re-run the algorithm after every removal.
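A minimal sketch of that modification, assuming an adjacency-matrix input; the INF marker, the sample weights, and the variable names are mine, not from the question:

    #include <cstdio>
    #include <vector>
    using namespace std;

    const long long INF = 1e18;   // "no edge" marker

    int main() {
        int n = 4;
        // dist[i][j] starts as the edge weight (INF if no edge, 0 on the diagonal)
        vector<vector<long long>> dist = {
            {0, 3, INF, 7},
            {3, 0, 2, INF},
            {INF, 2, 0, 1},
            {7, INF, 1, 0}};
        // shortest[i][j] = next node after i on the shortest i->j path, -1 if none
        vector<vector<int>> shortest(n, vector<int>(n, -1));
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (i != j && dist[i][j] < INF) shortest[i][j] = j;

        // standard Floyd-Warshall, additionally maintaining the next-node table
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (dist[i][k] + dist[k][j] < dist[i][j]) {
                        dist[i][j] = dist[i][k] + dist[k][j];
                        shortest[i][j] = shortest[i][k];  // first hop goes toward k
                    }

        // walking the table reconstructs a path; each hop is O(1)
        for (int v = 0; v != 3; v = shortest[v][3]) printf("%d -> ", v);
        printf("3 (distance %lld)\n", dist[0][3]);  // 0 -> 1 -> 2 -> 3 (distance 6)
        return 0;
    }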
The other alternative is to run a graph search on each query. This way each removal takes zero time to update an auxiliary structure (because there is none), but each query pays for a full search, e.g. O(E + V log V) with Dijkstra's algorithm on a weighted graph.
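For comparison, a sketch of that per-query search with Dijkstra's algorithm; the adjacency-list representation and names are my assumptions:

    #include <functional>
    #include <queue>
    #include <vector>
    using namespace std;

    using Edge = pair<int, long long>;   // (neighbor, weight)
    const long long INF = 1e18;

    // one search per query: removals are free (just drop the node from adj),
    // but every query runs a full Dijkstra
    long long query(const vector<vector<Edge>>& adj, int s, int t) {
        vector<long long> d(adj.size(), INF);
        priority_queue<pair<long long, int>,
                       vector<pair<long long, int>>, greater<>> pq;
        d[s] = 0;
        pq.push({0, s});
        while (!pq.empty()) {
            auto [du, u] = pq.top(); pq.pop();
            if (du > d[u]) continue;       // stale queue entry
            if (u == t) return du;         // stop once the target settles
            for (auto [v, w] : adj[u])
                if (du + w < d[v]) { d[v] = du + w; pq.push({d[v], v}); }
        }
        return INF;                        // t is unreachable
    }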
What algorithm can be used to "balance" the query and update times for the all-pairs shortest-paths problem when nodes of the graph are being removed?
As mentioned by goran, you only have to recalculate those shortest paths that contained a node that has since been removed. So, I'd do the following:
- Calculate all the shortest paths using the Floyd-Warshall algorithm. This is O(n³).
- Create an index that maps each vertex to the list of shortest paths it participates in (in other words, for every vertex i, a list of (j, k) pairs such that i lies on the shortest path between vertices j and k; see the sketch after this list). This is O(n²d), where d is the graph diameter, since you have to loop over every vertex of each shortest path determined in the previous step, and there are n² such paths.
- After a vertex removal, look up the list of affected shortest paths in the index from step 2 and recalculate only those. Since you don't need all the shortest paths here, and presumably only a few pairs are affected, you are probably better off with multiple single-source searches; on a weighted graph that means Dijkstra's algorithm, O(m + n log n) per source with a suitable heap (plain breadth-first search, O(m + n) where m is the number of edges, only applies when all edge weights are equal).
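A minimal sketch of the index from step 2, assuming the next-node table nxt (with -1 meaning "no path") produced by the Floyd-Warshall modification described in the question; the container and function names are mine:

    #include <map>
    #include <utility>
    #include <vector>
    using namespace std;

    // affected[v] = list of (j, k) pairs whose shortest path passes through v
    map<int, vector<pair<int, int>>> buildIndex(const vector<vector<int>>& nxt) {
        int n = (int)nxt.size();
        map<int, vector<pair<int, int>>> affected;
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++) {
                if (j == k || nxt[j][k] == -1) continue;
                // walk the stored path and register every vertex on it
                for (int v = j; v != k; v = nxt[v][k])
                    affected[v].push_back({j, k});
                affected[k].push_back({j, k});
            }
        return affected;
    }

On removal of vertex v, affected[v] is then exactly the set of pairs whose distance has to be recomputed.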
I'll answer the implied non-generic variant of the problem, where the graph is a road network. This seems to me one of the cases where theoretical bounds on arbitrary graphs are less interesting.
Some of the best approaches to shortest-path finding on road networks are highway-node routing and transit-node routing (described in a dissertation by Dominik Schultes). The acceleration structure can be built reasonably fast (~15 min for the HNR of the whole of Europe) and it supports incremental updates. What's interesting is that query times for random queries are around 1 ms, and all-pairs distance tables can be calculated for as little as 0.2 µs per entry. What's even more interesting is the scaling of query performance: the data suggest it is sub-logarithmic, i.e. near-constant, in the graph size. Transit-node routing is even faster, at about 4.3 µs per random query. Alas, I don't have any information on how updatable that data structure is; intuitively, incremental updates shouldn't be too hard.
The same or similar approaches could be used on other graphs that are similar to road networks (sparse, planar, hierarchical).
Well, I don't think you need to rerun the whole build when a node is removed, only for those paths that contained the removed node.
A trivial solution to your problem would be to add information about redundant paths: for example, you could store a second shortest path avoiding each of the nodes on the primary shortest path. This does not reduce the computation time, but caches it. It also increases storage requirements by a factor of n (where n is the average number of nodes on a path) for single-node redundancy, by a factor of n·(n−1) for two-node redundancy, ... and by a factor of n! for complete redundancy; the initial build time grows by the same factor, as does the time needed to add nodes. A sketch of the single-node-redundancy computation follows.
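One way to compute those cached backups, sketched here as a Dijkstra variant that avoids one node; the function and parameter names are mine, and calling it once per node of the cached shortest path yields the single-node-failure backups:

    #include <functional>
    #include <queue>
    #include <vector>
    using namespace std;

    using Edge = pair<int, long long>;   // (neighbor, weight)
    const long long INF = 1e18;

    // shortest s->t distance in the graph with node `banned` excluded
    long long queryAvoiding(const vector<vector<Edge>>& adj,
                            int s, int t, int banned) {
        vector<long long> d(adj.size(), INF);
        priority_queue<pair<long long, int>,
                       vector<pair<long long, int>>, greater<>> pq;
        d[s] = 0;
        pq.push({0, s});
        while (!pq.empty()) {
            auto [du, u] = pq.top(); pq.pop();
            if (du > d[u]) continue;
            if (u == t) return du;
            for (auto [v, w] : adj[u])
                if (v != banned && du + w < d[v]) {  // never enter the banned node
                    d[v] = du + w;
                    pq.push({d[v], v});
                }
        }
        return INF;   // no backup path exists without `banned`
    }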
If there is a way to improve on this and reduce the rebuild time, then you need to find a structure that allows quick computation of the shortest distance but that is immutable (or cheap to recalculate) when nodes are removed from or added to the graph.
A quick recap of the Floyd-Warshall algorithm: the outer loop introduces one new node at a time, in addition to the nodes already processed, and relaxes all existing pairwise distances through that newly introduced node.
In this problem we reverse the order in which nodes are removed, so that each removed node acts as a newly added node.
    // del[1..n] holds the nodes in the order they were deleted;
    // we traverse this array from the end, re-adding one node per step
    for (int k = n; k >= 1; k--) {
        // relax every pair of distances through the re-added node del[k]
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= n; j++)
                graph[i][j] = min(graph[i][j],
                                  graph[i][del[k]] + graph[del[k]][j]);
        // now read off the all-pairs distances among the nodes present so
        // far, i.e. del[k], ..., del[n], and store this snapshot in a vector
    }
    // finally, reverse the vector of snapshots to get the all-pairs
    // distances after each removal, in the original deletion order
So the running time is the same as that of the Floyd-Warshall algorithm, i.e. O(n³).
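A self-contained sketch of the whole offline procedure, with 0-based indexing; the tiny sample graph and the deletion order are my own assumptions for illustration:

    #include <algorithm>
    #include <cstdio>
    #include <vector>
    using namespace std;

    const long long INF = 1e18;

    int main() {
        int n = 3;
        // adjacency matrix with edge weights, INF when no edge, 0 on the diagonal
        vector<vector<long long>> g = {
            {0, 1, 5},
            {1, 0, 2},
            {5, 2, 0}};
        vector<int> del = {2, 0, 1};     // nodes in the order they are removed
        vector<vector<vector<long long>>> snapshots;

        // process deletions in reverse: step k re-adds node del[k]
        for (int k = n - 1; k >= 0; k--) {
            int v = del[k];
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (g[i][v] + g[v][j] < g[i][j])
                        g[i][j] = g[i][v] + g[v][j];
            snapshots.push_back(g);      // valid for the nodes del[k..n-1]
        }
        reverse(snapshots.begin(), snapshots.end());

        // snapshots[t] now holds the all-pairs distances after the first t
        // removals; entries involving already-removed nodes must be ignored
        printf("dist(0,2) before any removal: %lld\n", snapshots[0][0][2]);  // 3
        return 0;
    }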