Path Problems in Directed Graphs
The weight of an edge in a directed graph is often thought of as its length. The length of a path <v0, v1, ..., vn> is the sum of the lengths of all component edges <vi,vi+1>. Finding the shortest paths between vertices in a graph is an important class of problem.
Single Source Shortest Paths in a Directed Graph
As an example of a path problem, the fire-brigade keeps a map of the city marked with the locations of especially hazardous sites, such as chemical stores. They wish to know the shortest route from the fire-station to each site. Note the "length" of a road might be either its physical length or the estimated driving time on it, which are not necessarily proportional to each other.
It turns out that it as easy to find the shortest paths from a single source to all other vertices as it is to find the shortest path between any two vertices. Usually the source is taken to be v1. Dijkstra's algorithm solves this single-source shortest paths problem in O(|V|2) time. It operates by enlarging the set of vertices `done' for which the shortest paths from the source are known. Initially done contains just the source v1. At an intermediate stage, the vertex not in the set done that is closest to the source is found and added to done. This allows our knowledge of the shortest paths to the remaining vertices in V - done to be updated. This is repeated until done contains all vertices.
-- Graph G = <V, E> -- P[i] is the best (known) path length -- from source to vertex i done := {v1} for vertex i in V-{1} P[i] := E[1,i] --direct edges end for loop |V|-1 times find closest vertex to v1 in V - done done +:= {closest} for vertex j in V - done P[j]:=min(P[j], --update knowledge on shortest paths, P[closest]+E[closest,j]) --perhaps better? end for end loop --- Dijkstra's Single-Source Shortest Paths Algorithm ---
Initially P reflects the direct edges from the source. On the first iteration of the main loop, the vertex with the shortest direct connection from the source is chosen. This is correct because any indirect path to this vertex would have to leave the source by an edge at least as long. Once a closest vertex is chosen to add to done, it may give knowledge of shorter paths from the source, via closest, to other as yet unchosen vertices.
The algorithm follows what is known as a greedy strategy. It adds vertices to done as cheaply as possible. The strategy is often a good heuristic; in this problem it also gives a correct algorithm.
As given, the algorithm calculates the lengths of the shorest paths from the source to each other vertex. If it is necessary to find the paths themselves, note that the algorithm traces a rooted tree with the source as the root. When the vector of path lengths is updated, if P(j) is reduced by the `min' then `closest' can be associated with j in another vector. This allows the paths to be recovered, in a reverse direction.
Correctness
The algorithm is correct because of a property of shortest paths: If Pk = v1, vi, ..., vj, vk, is a shortest path from v1 to vk, then Pj = v1, vi, ..., vj, must be a shortest path from v1 to vj, otherwise Pk would not be as short as possible. Also, Pj must be shorter than Pk (assuming that all edges have positive weights). So the algorithm must have found Pj on an earlier iteration than when it found Pk. i.e. Shortest paths can be found by extending earlier known shortest paths by single edges, as the algorithm does.
This property of shortest paths, and its use here, is an example of dynamic programming.
Complexity
The algorithm contains an outer loop executed |V|-1 times and inner loops, to find the closest vertex and update distances, executed O(|V|) times for each iteration of the outer loop. Its time-complexity is therefore O(|V|2).
All Pairs Shortest Paths in a Directed Graph
Floyd's algorithm calculates the costs of the shortest path between each pair of vertices in O(|V|3) time. It consists of three nested loops. The invariant of the outer loop is the key to the algorithm. At the start of an iteration, P holds the optimal path length from vi to vj, for each i and j, considering only paths that go direct or via vertices vn for n<k. This is certainly true initially when k=1 and P holds only direct paths. At each iteration the next value of k is considered. There may now be a better path possible from vi to vj via this new vk, but note that it will visit vk at most once. This means it is sufficient to consider paths from vi to vk possibly via {v1, ..., vk-1} and then on from vk to vj also possibly via {v1, ..., vk-1}. Thus the invariant is maintained. Finally P holds optimal path lengths for unrestricted paths.
-- G = <V, E> F := E; --initialise the result Graph edges F for vertex k in 1..|V| do --Invariant: -- F[v1,v2] is shortest distance from -- v1 to v2 possibly via vertices in {1..k-1}. for vertex i in 1..|V| do for vertex j in 1..|V| do F[i,j] := min(F[i,j], F[i,k]+F[k,j]) --?does k help? end for end for end for --- Floyd's all-pairs shortest paths algorithm ---
Correctness
The invariant in the code is the key to the algorithm's correctness. At the start of the kth iteration of the outer loop, F[,] contains the lengths of the shortest paths from vi to vj, possibly via intermediate vertices in the set {v1, ..., vk-1}. This is certainly true initially when no intermediate vertices are allowed. Now there is no point in visiting any vertex more than once on any shortest path (+ve edge weights). So if vk is to improve on the shortest known path from vi to vj then it can only be by going from vi to vk, possibly via vertices in {v1, ..., vk-1}, and then from vk to vj, possibly via vertices in {v1, ..., vk-1}. Such possibilities are considered in the inner `min(,)'.
Complexity
With three nested loops, Floyd's algorithm runs in O(|V|3)-time.
Running Dijkstra's single source algorithm |V| times with each vertex as the source in turn also finds all shortest paths in O(|V|3) time but Floyd's algorithm is more direct.
Connectedness and Transitive Closure
Recall that vertices vi and vj are adjacent in a graph G if there is an edge <vi,vj>. This information can be stored in a Boolean adjacency matrix A. vi and vj are said to be connected if there is a path from vi to vj. A variation on Floyd's algorithm calculates connectivity. This is Warshall's algorithm which was actually discovered first. Omitting the checks for missing edges, the central operation in Floyd's algorithm is:
F[i,j] := min( F[i,j], F[i,k]+F[k,j] )
If this is replaced by operations on a Boolean connectivity matrix C,
C[i,j] := C[i,j] or C[i,k] and C[k,j]
the resulting algorithm calculates connectivity:
--Graph G = <V, E> C := A; --C[i,j] := (<i,j> in E) where i&j in {1..|V|} --i.e. direct edges only for vertex k in 1..|V| do --Invariant: -- C[i,j] iff there is a path i--->j direct or -- via vertices in {1..k-1} only for vertex i in 1..|V| do for vertex j in 1..|V| do C[i,j] := C[i,j] or C[i,k] and C[k,j] end for end for end for --- Warshall's Algorithm ---
C is the transitive closure of A. Note that A contains paths of just one edge, A2 contains paths of exactly two edges and Am contains paths of m edges. If A1 to A|V|-1 are or-ed together we get all paths but Warshall's algorithm achieves this in just O(|V|3) time and with just one matrix C.
The notion of closure can be extended to more general operations, f and g provided that
- (i) f(x,y) = f(y,x)
- (ii) f(x,x) = x
- (iii) g(x,f(y,z)) = f(g(x,y), g(x,z)), g distributes over f
- (iv) g(x, g(y, z)) = g(g(x, y), z), g is associative
- (ii) f(x,x) = x
C[i,j] := f(C[i,j], g(C[i,k], C[k,j]))
For minimum paths <f,g>=<min,+>
and for connectivity <f,g>=<or,and>.
(Exercise: verify that these pairs <f,g> satisfy the requirements.)
Correctness
Correctness follows from the same argument used above for Floyd's algorithm.
Complexity
The time-complexity of Warshall's algorithm is O(|V|3), as for Floyd's algorithm.
Demonstration
The HTML FORM
below allows a random
edge-weighted, directed graph
with |V| vertices to be created, manipulated and analysed.
The value `pr' is the probability of there being an edge <i,j>;
it controls the sparseness of the graph
and on average there will be pr*|V|*(|V|-1) edges:
controls:
Notes
- E. W. Dijkstra. A note on two problems in connection with graphs. Numerische Mathematik 1 p269-271, 1959.
- Note that M. Thorup, Undirected single-source shortest paths with positive integer weights in linear time, JACM 46(3), pp362-394, May 1999, gives an O(|E|)-time and space algorithm for undirected graphs with integer edge weights. That algorithm is of theoretical interest only, but T' claims that there is a more practical variation that runs in O(Ackerman-1(|V|,|E|) . |E|) time, i.e. in time very nearly linear in |E|.