Graph algorithms: Lecture notes on exponential algorithms

Throughout algorithms classes we learn that polynomial time bounds are good and exponential ones bad. This attitude has led theoretical CS to systematically avoid studying exponential time algorithms, so it is an area where much low-hanging fruit may remain. This is also evidenced by the big gap between known worst-case bounds and experimentally measured behavior.

Why is it reasonable to study exponential algorithms?

If they're impractical, isn't studying them about as useful as counting angels on pinheads?

Growth rates

(This is a typical freshman exercise, but let's go through it again.)

What is a typical "reasonable" problem size we can solve for a given exponential time bound?

Modern computers perform roughly 2^30 operations/second. So, if some algorithm takes time f(n), how big can n be to solve the problem in the given time?

Operations:  2^30     2^36      2^42     2^48     2^54
Time:        1 sec    1 minute  1 hour   3 days   > 6 months

f(n):        max n for given time:

1.05^n       426      511       596      681      767
1.1^n        218      262       305      349      392
1.2^n        114      136       159      182      205
1.3^n        79       95        111      127      142
1.4^n        62       74        86       99       111
1.5^n        51       61        72       82       92
2^n          30       36        42       48       54
3^n          19       23        26       30       34
n!           12       14        15       17       18
n^n          9        10        11       13       14
2^(n^2)      5        6         6        7        7

Typical naive algorithms take time 2^n or more, and can only solve smaller problems. Typical fast worst-case bounds are in the range 1.2^n - 1.5^n. Typical empirically measured time bounds are in the range 1.05^n - 1.1^n. So, for instance, we can solve certain kinds of constraint satisfaction problems exactly up to 500 variables even for the hardest examples (and examples coming from applications are often far from hardest): exponential algorithms can be practical.
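
As a sanity check, here is a short Python sketch that reproduces a few rows of the table (the one-operation-per-time-unit cost model and all names are my own simplifications):

import math

# operation budgets corresponding to the times in the table
budgets = [2**30, 2**36, 2**42, 2**48, 2**54]

# a few of the growth rates from the table
growth = {
    "1.05^n":  lambda n: 1.05**n,
    "2^n":     lambda n: 2.0**n,
    "n!":      lambda n: math.factorial(n),
    "n^n":     lambda n: n**n,
    "2^(n^2)": lambda n: 2.0**(n*n),
}

for name, f in growth.items():
    row = []
    for ops in budgets:
        n = 1
        while f(n + 1) <= ops:      # largest n solvable within the budget
            n += 1
        row.append(n)
    print(name, row)                # e.g. "2^n [30, 36, 42, 48, 54]"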

Backtracking search (branch and bound)

A simple example: 3-coloring. Start with a recursive generate-and-test 3-coloring algorithm:

color(G,i) {
    if (i == n) {
        test if valid coloring
        if so, return success
        else return failure
    }
    for (each possible color c) {
        try giving v[i] color c
        if (color(G,i+1) == success) return success
    }
    uncolor v[i]
    return failure
}

There are n levels of recursion, and the recursion branches 3 ways at each level, so the time is 3^n. One obvious improvement: interleave the validity testing into the recursion, rather than waiting until the graph is all colored before discovering some early mistake.

color(G,i) {
    if (i == n) return success
    for (each color c not already used by a neighbor of v[i]) {
        try giving v[i] color c
        if (color(G,i+1) == success) return success
    }
    uncolor v[i]
    return failure
}

Unfortunately, while this makes a practical improvement, the worst case is still 3^n. The problem is that we need to choose an ordering of the vertices that allows early termination to happen often. One possibility: spanning tree preorder (e.g. depth first search numbering). Then each vertex is colored *after* its parent (except for the tree root, which has no parent, but we only need to try one of the three colors there), so each non-root vertex has at most two colors that survive the neighbor test, and the number of branches drops to 2^(n-1).
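
To make this concrete, here is a runnable Python sketch of the improved search (the dict-of-sets graph format and the name three_color are my own choices): vertices are colored in DFS preorder, so every non-root vertex is colored after its parent, and each DFS root gets a single fixed color.

def three_color(adj):
    order, roots, seen = [], set(), set()

    def dfs(u):                     # record a DFS preorder of the vertices
        seen.add(u)
        order.append(u)
        for w in adj[u]:
            if w not in seen:
                dfs(w)

    for u in adj:                   # handle disconnected graphs too
        if u not in seen:
            roots.add(u)
            dfs(u)

    color = {}

    def extend(i):
        if i == len(order):
            return True             # every vertex colored consistently
        v = order[i]
        # a root needs only one color tried; others get the pruned loop
        for c in ((0,) if v in roots else (0, 1, 2)):
            # interleaved validity test: skip colors used by a neighbor
            if all(color.get(w) != c for w in adj[v]):
                color[v] = c
                if extend(i + 1):
                    return True
                del color[v]        # uncolor and backtrack
        return False

    return color if extend(0) else None

For example, three_color({0: {1, 2}, 1: {0, 2}, 2: {0, 1}}) colors a triangle, while adding a fourth vertex adjacent to all three (K4) correctly yields None.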

Changing the solution space

Instead of looking for a 3-coloring, let's look for the subset of red vertices of a 3-coloring. We can then test whether the remaining vertices can be 2-colored (i.e., whether they form a bipartite graph) in linear time. This immediately gives us a 2^n algorithm (that's how many different subsets of vertices there are) without as much effort as we took above. With more thought, we can do even better: if we choose a coloring with as many red vertices as possible, the red vertices will form a maximal independent set: a set of vertices, with no edge connecting any two vertices in the set, such that every remaining vertex in the graph shares an edge with a vertex in the set. So, we can solve 3-coloring by listing all maximal independent sets and testing for each whether the complementary set is bipartite.
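
For completeness, here is a sketch of the linear-time bipartiteness test this relies on (same graph format as above; the name rest_is_bipartite is mine): try to 2-color the non-red vertices by breadth first search, failing exactly when an odd cycle survives.

from collections import deque

def rest_is_bipartite(adj, red):
    side = {}                       # vertex -> 0 or 1 (the two colors)
    for s in adj:
        if s in red or s in side:
            continue
        side[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in red:
                    continue        # red vertices are deleted from the graph
                if w not in side:
                    side[w] = 1 - side[u]
                    queue.append(w)
                elif side[w] == side[u]:
                    return False    # odd cycle: not 2-colorable
    return True

The 2^n algorithm is then just a loop over all independent subsets red, accepting whenever rest_is_bipartite succeeds on the complement.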

The algorithm below solves a slightly more general problem: given a graph G and two sets of vertices Y and N, list all maximal independent sets containing everything in Y and nothing in N.

listMIS(G,Y,N)
{
    if (G = Y u N) output Y and return

    choose a vertex v in G - Y - N

    if (v not adjacent to anything in Y)
        listMIS(G, Y u {v}, N)

    if (v isn't the last undecided neighbor of a vertex in N)
        listMIS(G, Y, N u {v})
}
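
Here is a direct Python transcription (a sketch: the generator interface and the dict-of-sets graph format are my choices, and the second test is spelled out precisely: v may join N only if every excluded vertex keeps some hope of a neighbor in Y).

def list_mis(adj, Y, N, undecided):
    if not undecided:               # G = Y u N: report the set found
        yield set(Y)
        return
    v = min(undecided)              # "choose a vertex v"
    rest = undecided - {v}
    # v can join Y only if it is not adjacent to anything already in Y
    if not (adj[v] & Y):
        yield from list_mis(adj, Y | {v}, N, rest)
    # v can join N only if no vertex of N u {v} is left with neither a
    # neighbor in Y nor an undecided neighbor that could later join Y
    if all((adj[u] & Y) or (adj[u] & rest) for u in N | {v}):
        yield from list_mis(adj, Y, N | {v}, rest)

Calling list(list_mis(adj, set(), set(), set(adj))) on a triangle yields its three maximal independent sets, one per vertex.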

How to analyze? Obviously the time is at most 2^n (each vertex goes either into Y or into N), and one can come up with examples, as in 3-coloring, where a careless vertex ordering leads to at least this much time.

We'd like to do better than this. Here's an idea: each call reduces the size of G-Y-N by only one vertex, and we want to shrink the set of undecided vertices more quickly. If we add v to Y, we can immediately move all of its neighbors into N (the more neighbors the better). Define degree(v) = the number of neighbors of v in G-Y-N. Then when we add v to Y, the size of G-Y-N shrinks by 1+degree(v). If we add v to N, we don't get as big a reduction in subproblem size, but v must eventually have at least one neighbor in Y, and we can branch on which neighbor that is (so we want all of v's neighbors to have high degree).

listMIS(G,Y,N)
{
    if (some vertex in N has all its neighbors in N) return without output
    if (G = Y u N) output Y and return

    choose vertex v s.t. all neighbors have degree(neighbor) >= degree(v)
    (e.g. let v = a minimum degree vertex in G-Y-N)

    listMIS(G, Y u {v}, N u nbrs(v))

    let A = N
    for (each neighbor w of v in G-Y-N) {
        listMIS(G, Y u {w}, A u nbrs(w))
        A = A u {w}
    }
}
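
A Python sketch of this refinement, under the same assumptions as before; as in the pseudocode, the no-output test comes before the leaf test, so a non-maximal Y is pruned rather than reported.

def list_mis_fast(adj, Y=frozenset(), N=frozenset(), undecided=None):
    if undecided is None:
        undecided = frozenset(adj)
    # a vertex of N with all neighbors in N can never gain a neighbor in Y,
    # so Y could still be enlarged: prune without output
    if any(adj[u] <= N for u in N):
        return
    if not undecided:               # G = Y u N: Y is maximal
        yield set(Y)
        return
    # minimum degree vertex of G-Y-N (degrees measured within G-Y-N)
    v = min(undecided, key=lambda u: len(adj[u] & undecided))
    nbrs_v = adj[v] & undecided
    # branch: v joins Y, forcing all its undecided neighbors into N
    yield from list_mis_fast(adj, Y | {v}, N | nbrs_v,
                             undecided - {v} - nbrs_v)
    # otherwise some neighbor w of v joins Y instead; A accumulates the
    # neighbors already tried so the same set is never output twice
    A = set(N)
    for w in nbrs_v:
        newN = frozenset(A) | adj[w]
        yield from list_mis_fast(adj, Y | {w}, newN,
                                 undecided - {w} - newN)
        A.add(w)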

Analysis: suppose v has degree d. Then we get d+1 recursive calls, each on a problem with at least d+1 fewer undecided vertices (this is tight when every neighbor of v also has degree d and the addition of w to A is redundant because w already lies in later neighborhoods).

Oversimplified analysis: suppose d is the same in all calls (it won't be). Then there are n/(d+1) levels of recursion, so time = (d+1)^(n/(d+1)) = ((d+1)^(1/(d+1)))^n. The worst case is d = 2, giving time (3^(1/3))^n = 3^(n/3): 3^(1/3) ~ 1.442 beats both 2^(1/2) ~ 1.414 and 4^(1/4) ~ 1.414.

Now do the analysis more carefully: analyze by counting the number of leaves L(n) of the recursion tree (the total time is within a polynomial factor of this count). We prove by induction that L(n) <= 3^(n/3). In a call with degree(v) = d, we get

L(n) <= (d+1) L(n - (d+1))
     <= (d+1) 3^((n - (d+1))/3)       [induction hypothesis]
      = ((d+1) / 3^((d+1)/3)) 3^(n/3)
     <= 3^(n/3)                       [since d+1 <= 3^((d+1)/3) for all d >= 0]

Corollary: any graph has at most 3^(n/3) maximal independent sets. An example showing the analysis is tight: the disjoint union of n/3 triangles, where each triangle independently contributes a three-way choice of which of its vertices joins the set.
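
Using the list_mis_fast sketch above, the tightness claim is easy to check numerically:

# the disjoint union of k triangles has exactly 3^k maximal independent sets
k = 5
adj = {}
for t in range(k):
    a, b, c = 3*t, 3*t + 1, 3*t + 2
    adj[a], adj[b], adj[c] = {b, c}, {a, c}, {a, b}

print(sum(1 for _ in list_mis_fast(adj)))   # 243 = 3^(15/3)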