# Shortest Paths

The basic problem: Find the "best" way of getting from s to t where s and t are vertices in a graph. We measure "best" simply as the sum of edge lengths of a path. For instance the graph could be a map representing intersections as vertices, road segments as edges; you want to find either the shortest or fastest route from your house to ICS. Although both of these problems have different solutions, they are both shortest path problems; in one the length of an edge represents the actual mileage of a segment of road, while in the other it represents the time it would take to drive it, but in both cases the important fact is that the total length of a path is measured by adding the lengths of individual edges. For another example, which I mentioned in my first lecture on graph algorithms, the graph might have vertices representing airports, with edges representing possible flights, and the "length" of an edge measuring the cost of taking that flight; your problem would then be to find the cheapest flight from e.g. SNA to JFK. Note that these graphs may be directed; e.g. there may be a one-way road, or flights in one direction might have different costs than those the other way.

We are going to make a big assumption: that all the edges have lengths that are positive numbers. This is often but not always the case; it makes sense in the examples above, but it is conceivable that an airline could pay people to take certain routes, so that the lengths of those edges in the airport graph might be negative. We'll pretend this never happens. It makes the algorithms a lot easier. Later we'll see some special cases where we can handle negative weights.

Rather than computing one distance d(s,t), we'll compute d(s,x) for all vertices x. This is known as the single source shortest path problem (s is the source). It turns out that computing this extra information makes things easier, because then we can put together information about paths with fewer edges to get paths with more edges.

## Paths from distances

Suppose we already know the distances d(s,x) from s to every other vertices. This isn't a solution to the shortest path problem, because we want to know actual paths having those distances. How can we find those paths? That there are two kinds of shortest paths: those formed by a single edge (s,t), and those in which the path from s to t goes through some other vertices; let's say x is the last vertex the path goes through before t. Then in the second case, the overall path must be formed by concatenating a path from s to x with edge (x,t). (We can view both types of shortest path as being similar if we think of the shortest path from s to s as being one with no edges in it.) Further, the path from s to x must itself be a shortest path (since otherwise concatenating the shortest path with (x,t) would decrease the length of the overall path). A final observation is that d(s,x) must be less than d(s,t), since d(s,x)=d(s,t)+length(x,t) and we are assuming all edges have positive length.

Therefore if we only know the correct value of x we can find a shortest path:

Algorithm 1:

```    for each vertex y in sorted order by d(s,y)
let (x,y) be an edge with d(s,x)+length(x,y)=d(s,y)
path(s,y) = path(s,x) + edge (x,y)
```
We will want to use something like this idea to compute shortest paths without already knowing their lengths. When we get to y in the loop, it will still be ok to use terms like d(s,x) if this is less than d(s,y), because we will have already processed x in a previous iteration. But the pseudo-code above uses d(s,y) itself twice, and this will not work as well.

To get rid of the second use of d(s,y), in which we test to determine which edge to use, we can notice that (because we are computing a shortest path) d(s,x)+length(x,y) will be less than any similar expression, so instead of testing it for equality with d(s,y) we can just find a minimum:

Algorithm 2:

```    for each vertex y in sorted order by d(s,y)
let (x,y) be an edge with x already processed,
minimizing d(s,x)+length(x,y)
path(s,y) = path(s,x) + edge (x,y)
d(s,y) = d(s,x) + length(x,y)
```

## Dijkstra's algorithm

The only remaining use of d(s,y) in this algorithm is to determine what order to process the vertices in. Dijkstra's algorithm for shortest paths does this almost exactly like Prim's algorithm. Remember that in Prim's algorithm, we add vertices and edges one a a time to a tree, at each step choosing the shortest possible edge to add. Dijkstra's algorithm does the same thing, only choosing the edge to add at each step to be the one minimizing d(s,x)+length(x,y).

Algorithm 3: (Dijkstra, basic outline)

```    let T be a single vertex s
while (T has fewer than n vertices)
{
find edge (x,y)
with x in T and y not in T
minimizing d(s,x)+length(x,y)
d(s,y)=d(s,x)+length(x,y)
}
```

The actual shortest paths can be found by following the path in T from s to t. This defines a structure known as a "shortest path tree". In practice it may sometimes faster to build two trees, one from s and one from t, and stop when they run into each other (this usually ends up visiting less of the graph).

Just like with Prim's algorithm, we can use heaps to perform the hard part of each iteration (finding the best edge) in logarithmic time.

Algorithm 4: (Dijkstra with heaps)

```    make a heap of values (vertex,edge,distance)
initially (v,-,infinity) for each vertex
let tree T be empty
while (T has fewer than n vertices)
{
let (v,e,d(v)) have the smallest weight in the heap
remove (v,e,d(v)) from the heap
add v and e to T
set distance(s,v) to d(v)
for each edge f=(v,u)
if u is not already in T
find value (u,g,d(u)) in heap
if d(v)+length(f) < d(g)
replace (u,g,d(g)) with (u,f,d(v)+length(f))
}
```
Just as in Prim's algorithm, this runs in time O(m log n) if you use binary heaps, or O(m + n log n) if you use Fibonacci heaps.

## Dijkstra and negative lengths

Dijkstra's algorithm does not work with negative edge weights. For instance, consider the following graph (assume the edges are all directed from left to right):
```       2
A-----B
\   /
3 \ / -2
C
```
If we start with A, Dijkstra's algorithm will choose the edge (A,x) minimizing d(A,A)+length(edge), namely (A,B). It then sets d(A,B)=2 and chooses another edge (y,C) minimizing d(A,y)+d(y,C); the only choice is (A,C) and it sets d(A,C)=3. But it never finds the shortest path from A to B, via C, with total length 1.

## Topological ordering and shortest paths

There is an important class of graphs in which shortest paths can be computed more quickly, in linear time. The idea is to go back to algorithms 1 and 2, which required you to visit the vertices in some order. In those algorithms we defined the order to be sorted by distance from s, which as we have seen works for positive weight edges, but not if there are negative weights. Here's another ordering that always works: define a topological ordering of a directed graph to be one in which, whenever we have an edge from x to y, the ordering visits x before y. If we can define such an ordering, then we can do something like algorithm 2, and be sure that the predecessor of a vertex x is always processed before we process x itself.

Algorithm 5: (shortest paths from topological order)

```    for each vertex y in a topological ordering of G
choose edge (x,y) minimizing d(s,x)+length(x,y)
path(s,y) = path(s,x) + edge (x,y)
d(s,y) = d(s,x) + length(x,y)
```
This runs in linear time (with the possible exception of finding the ordering), and works even when the graph has negative length edges. You can even use it to find longest paths: just negate the lengths of all the edges. The only catch is that it only works when we can find a topological ordering.

## Topological ordering and acyclic graphs

Define a directed acyclic graph (often known as a DAG for short) to be a directed graph, containing no cycle (a cycle is a set of edges forming a loop, and all pointing the same way around the loop).

Theorem: a graph has a topological ordering if and only if it is a directed acyclic graph.

One direction of the proof is simple: suppose G is not a DAG, so it has a cycle. In any ordering of G, one vertex of the cycle has to come first, but then one of the two cycle edges at that vertex would point the wrong way for the ordering to be topological. In the other direction, we have to prove that every graph without a topological ordering contains a cycle. We'll prove this by finding an algorithm for constructing topological orderings; if the algorithm ever gets stuck we'll be able to use that information to find a cycle.

Algorithm 6: (topological ordering)

```    list L = empty
while (G is not empty)
find a vertex v with no incoming edges
delete v from G
```
If this algorithm terminates, L is a topological ordering, since we only add a vertex v when all its incoming edges have been deleted, at which point we know its predecessors are already all in the list.

What if it doesn't terminate? The only thing that could go wrong is that we could be unable to find a vertex with no incoming edges. In this case all vertices have some incoming edge. We want to prove that in this case, G has a cycle. Start with any vertex s, follow its incoming edge backwards to another vertex t, follow its incoming edge backwards again, and so on, building a chain of vertices ...w->v->u->t->s.

We can keep stepping backwards like this forever, but there's only a finite number of vertices in the graph. Therefore, we'll eventually run into a vertex we've seen before: u->w->v->u->t->s. In this case, u->w->v->u is a directed cycle. This procedure always finds a directed cycle whenever algorithm 6 gets stuck, completing the proof of the theorem that a graph has a topological ordering if and only if it is a DAG. Incidentally this also proves that algorithm 6 finds a topological ordering whenever one exists, and that we can use algorithm 6 to test whether a graph is a DAG. Putting algorithm 6 together with the "stepping backwards" procedure provides a fast method of finding cycles in graphs that are not DAGs.

Finally, let's analyze the topological ordering algorithm. The key step (finding a vertex without incoming edges) seems to require scanning the whole graph, but we can speed it up with some really simple data structures: a count I[v] of the number of edges incoming to v, and a list K of vertices without incoming edges.

Algorithm 7: (topological ordering, detailed implementation)

```    list K = empty
list L = empty
for each vertex v in G
let I[v] = number of incoming edges to v
if (I[v] = 0) add v to K
while (G is not empty)
remove a vertex v from K
for each outgoing edge (v,w)
decrement I[w]
if (I[w] = 0) add w to K