### Path Problems in Directed Graphs

The weight of an edge in a directed graph is often thought of as
its *length*.
The length of a path
<v_{0}, v_{1}, ..., v_{n}>
is the sum of the lengths of all component edges
<v_{i},v_{i+1}>.
Finding the shortest paths between vertices in a graph is
an important class of problem.

### Single Source Shortest Paths in a Directed Graph

As an example of a path problem, the fire-brigade keeps a map of the city marked with the locations of especially hazardous sites, such as chemical stores. They wish to know the shortest route from the fire-station to each site. Note the "length" of a road might be either its physical length or the estimated driving time on it, which are not necessarily proportional to each other.

It turns out that it as easy to find the shortest paths from
a single source to all other vertices
as it is to find the shortest path between any two vertices.
Usually the source is taken to be v_{1}.
**Dijkstra**'s
algorithm solves this *single-source shortest paths problem*
in O(|V|^{2}) time.
It operates by enlarging the set of vertices `done' for which the
shortest paths from the source are known.
Initially done contains just the source v_{1}.
At an intermediate stage,
the vertex *not* in the set done that is closest to the source
is found and added to done.
This allows our knowledge of the shortest paths to the remaining vertices in
V - done to be updated.
This is repeated until done contains all vertices.

-- Graph G = <V, E> -- P[i] is the best (known) path length -- from source to vertex i done := {v_{1}} for vertex i in V-{1} P[i] := E[1,i] --direct edges end for loop |V|-1 times find closest vertex to v_{1}in V - done done +:= {closest} for vertex j in V - done P[j]:=min(P[j], --update knowledge on shortest paths, P[closest]+E[closest,j]) --perhaps better? end for end loop--- Dijkstra's Single-Source Shortest Paths Algorithm ---

Initially P reflects the direct edges from the source. On the first iteration of the main loop, the vertex with the shortest direct connection from the source is chosen. This is correct because any indirect path to this vertex would have to leave the source by an edge at least as long. Once a closest vertex is chosen to add to done, it may give knowledge of shorter paths from the source, via closest, to other as yet unchosen vertices.

The algorithm follows what is known as a *greedy* strategy.
It adds vertices to done as cheaply as possible.
The strategy is often a good heuristic;
in this problem it also gives a correct algorithm.

As given, the algorithm calculates the lengths of the shorest paths from the source to each other vertex. If it is necessary to find the paths themselves, note that the algorithm traces a rooted tree with the source as the root. When the vector of path lengths is updated, if P(j) is reduced by the `min' then `closest' can be associated with j in another vector. This allows the paths to be recovered, in a reverse direction.

#### Correctness

The algorithm is correct because of a property of shortest paths:
If P_{k} = v_{1}, v_{i}, ...,
v_{j}, v_{k},
is a shortest path from v_{1} to v_{k}, then
P_{j} = v_{1}, v_{i}, ..., v_{j},
must be a shortest path from v_{1} to v_{j},
otherwise P_{k} would not be as short as possible.
Also, P_{j} must be shorter than P_{k}
(assuming that all edges have positive weights).
So the algorithm must have found P_{j} on
an earlier iteration than when it found P_{k}.
i.e. Shortest paths can be found by extending earlier
known shortest paths by single edges, as the algorithm does.

This property of shortest paths, and its use here, is an example of dynamic programming.

#### Complexity

The algorithm contains an outer loop executed |V|-1 times
and inner loops, to find the closest vertex and update distances,
executed O(|V|) times for each iteration of the outer loop.
Its time-complexity is therefore O(|V|^{2}).

### All Pairs Shortest Paths in a Directed Graph

**Floyd**'s
algorithm calculates the costs of the shortest path between
each pair of vertices in O(|V|^{3}) time.
It consists of three nested loops.
The invariant of the outer loop is the key to the algorithm.
At the start of an iteration, P holds the optimal path length
from v_{i} to v_{j}, for each i and j,
considering only paths that go direct or via
vertices v_{n} for n<k.
This is certainly true initially when k=1 and P holds only direct paths.
At each iteration the next value of k is considered.
There may now be a better path possible from v_{i} to v_{j}
via this new v_{k},
but note that it will visit v_{k} at most once.
This means it is sufficient to consider paths
from v_{i} to v_{k} possibly via {v_{1}, ..., v_{k-1}}
and then on from v_{k} to v_{j}
also possibly via {v_{1}, ..., v_{k-1}}.
Thus the invariant is maintained.
Finally P holds optimal path lengths for unrestricted paths.

-- G = <V, E> F := E; --initialise the result Graph edges F for vertex k in 1..|V| do --Invariant: -- F[v1,v2] is shortest distance from -- v1 to v2 possibly via vertices in {1..k-1}. for vertex i in 1..|V| do for vertex j in 1..|V| do F[i,j] := min(F[i,j], F[i,k]+F[k,j]) --?does k help? end for end for end for--- Floyd's all-pairs shortest paths algorithm ---

#### Correctness

The invariant in the code is the key to the algorithm's correctness.
At the start of the k^{th} iteration of the outer loop,
F[,] contains the lengths of the shortest paths from
v_{i} to v_{j},
possibly via *intermediate vertices* in the set
{v_{1}, ..., v_{k-1}}.
This is certainly true initially when *no*
intermediate vertices are allowed.
Now there is no point in visiting any vertex more than *once* on
any shortest path (+ve edge weights).
So if v_{k} is to improve on the shortest known path from
v_{i} to v_{j} then it can only be by
going from v_{i} to v_{k},
possibly via vertices in
{v_{1}, ..., v_{k-1}},
and then from v_{k} to v_{j},
possibly via vertices in
{v_{1}, ..., v_{k-1}}.
Such possibilities are considered in the inner `min(,)'.

#### Complexity

With three nested loops, Floyd's algorithm runs in O(|V|^{3})-time.

Running Dijkstra's single source algorithm |V| times
with each vertex as the source in turn also
finds all shortest paths in O(|V|^{3}) time
but Floyd's algorithm is more direct.

### Connectedness and Transitive Closure

Recall that
vertices v_{i} and v_{j} are *adjacent* in a graph G
if there is an edge <v_{i},v_{j}>.
This information can be stored in a Boolean adjacency matrix A.
v_{i} and v_{j} are said to be *connected* if there is a
path from v_{i} to v_{j}.
A variation on Floyd's algorithm calculates connectivity.
This is **Warshall**'s algorithm
which was actually discovered first.
Omitting the checks for missing edges,
the central operation in Floyd's algorithm is:

F[i,j] := min( F[i,j], F[i,k]+F[k,j] )

If this is replaced by operations on a Boolean connectivity matrix C,

C[i,j] := C[i,j] or C[i,k] and C[k,j]

the resulting algorithm calculates connectivity:

--Graph G = <V, E> C := A; --C[i,j] := (<i,j> in E) where i&j in {1..|V|} --i.e. direct edges only for vertex k in 1..|V| do --Invariant: -- C[i,j] iff there is a path i--->j direct or -- via vertices in {1..k-1} only for vertex i in 1..|V| do for vertex j in 1..|V| do C[i,j] := C[i,j] or C[i,k] and C[k,j] end for end for end for--- Warshall's Algorithm ---

C is the *transitive closure* of A.
Note that
A contains paths of just one edge,
A^{2} contains paths of exactly two edges
and
A^{m} contains paths of m edges.
If A^{1} to A^{|V|-1} are
**or**-ed together we get all paths
but Warshall's algorithm achieves this in just O(|V|^{3}) time
and with just one matrix C.

The notion of closure can be extended to more general operations, f and g provided that

- (i) f(x,y) = f(y,x)
- (ii) f(x,x) = x
- (iii) g(x,f(y,z)) = f(g(x,y), g(x,z)), g distributes over f
- (iv) g(x, g(y, z)) = g(g(x, y), z), g is associative

C[i,j] := f(C[i,j], g(C[i,k], C[k,j]))

For minimum paths <f,g>=<min,+>
and for connectivity <f,g>=<or,and>.

(Exercise: verify that these pairs <f,g> satisfy the requirements.)

#### Correctness

Correctness follows from the same argument used above for Floyd's algorithm.

#### Complexity

The time-complexity of Warshall's algorithm
is O(|V|^{3}), as for Floyd's algorithm.

### Demonstration

The HTML FORM
below allows a random
*edge-weighted, directed graph*
with |V| vertices to be created, manipulated and analysed.
The value `pr' is the probability of there being an edge <i,j>;
it controls the *sparseness* of the graph
and on average there will be pr*|V|*(|V|-1) edges:

#### controls:

### Notes

- E. W. Dijkstra.
*A note on two problems in connection with graphs*. Numerische Mathematik**1**p269-271, 1959. - Note that M. Thorup,
*Undirected single-source shortest paths with positive integer weights in linear time*, JACM**46**(3), pp362-394, May 1999, gives an O(|E|)-time and space algorithm for*undirected*graphs with*integer*edge weights. That algorithm is of theoretical interest only, but T' claims that there is a more practical variation that runs in O(Ackerman^{-1}(|V|,|E|)**.**|E|) time, i.e. in time very nearly linear in |E|.