Graph algorithms: Lecture notes on exponential algorithms

Throughout algorithms classes we learn that polynomial time bounds are good and exponential ones bad. This attitude has led theoretical CS to systematically avoid studying exponential time algorithms, so it is an area where much low-hanging fruit may remain. This is also evidenced by the big gap between known worst-case bounds and experimentally measured behavior.

Why is it reasonable to study exponential algorithms?

If they're impractical, isn't studying them about as useful as counting angels on pinheads?

Growth rates

(This is a typical freshman exercise, but let's go through it again.)

What is a typical "reasonable" problem size we can solve for a given exponential time bound?

Modern computers perform roughly 2^30 operations/second. So, if some algorithm takes time f(n), how big can n be to solve the problem in the given time?

Operations:  2^30     2^36      2^42     2^48     2^54
Time:        1 sec    1 minute  1 hour   3 days   > 6 months

f(n):        max n for given time:

1.05^n       426      511       596      681      767
1.1^n        218      262       305      349      392
1.2^n        114      136       159      182      205
1.3^n        79       95        111      127      142
1.4^n        62       74        86       99       111
1.5^n        51       61        72       82       92
2^n          30       36        42       48       54
3^n          19       23        26       30       34
n!           12       14        15       17       18
n^n          9        10        11       13       14
2^(n^2)      5        6         6        7        7

Typical naive algorithms take time 2^n or more, and can only solve smaller problems. Typical fast worst-case bounds are in the range 1.2^n - 1.5^n. Typical empirically measured time bounds are in the range 1.05^n - 1.1^n. So, for instance, we can solve certain kinds of constraint satisfaction problems exactly up to 500 variables even for the hardest examples (and examples coming from applications are often far from hardest): exponential algorithms can be practical.
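
As a sanity check, here is a short Python sketch that reproduces a few rows of the table (the one-operation-per-time-unit cost model and all names are my own simplifications):

import math

# operation budgets corresponding to the times in the table
budgets = [2**30, 2**36, 2**42, 2**48, 2**54]

# a few of the growth rates from the table
growth = {
    "1.05^n":  lambda n: 1.05**n,
    "2^n":     lambda n: 2.0**n,
    "n!":      lambda n: math.factorial(n),
    "n^n":     lambda n: n**n,
    "2^(n^2)": lambda n: 2.0**(n*n),
}

for name, f in growth.items():
    row = []
    for ops in budgets:
        n = 1
        while f(n + 1) <= ops:      # largest n solvable within the budget
            n += 1
        row.append(n)
    print(name, row)                # e.g. "2^n [30, 36, 42, 48, 54]"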

Backtracking search (branch and bound)

A simple example: 3-coloring. Start with a recursive generate-and-test 3-coloring algorithm:

color(G,i) {
    if (i == n) {
        test if valid coloring
        if so, return success
        else return failure
    }
    for (each possible color c) {
        try giving v[i] color c
        if (color(G,i+1) == success) return success
    }
    uncolor v[i]
    return failure
}

There are n levels of recursion, and the recursion branches 3 ways at each level, so the time is 3^n. One obvious improvement: interleave the validity testing into the recursion, rather than waiting until the graph is all colored before discovering some early mistake.

color(G,i) {
    if (i == n) return success
    for (each color c not already used by a neighbor of v[i]) {
        try giving v[i] color c
        if (color(G,i+1) == success) return success
    }
    uncolor v[i]
    return failure
}

Unfortunately, while this makes a practical improvement, the worst case is still 3^n. The problem is that we need to choose an ordering of the vertices that allows early termination to happen often. One possibility: spanning tree preorder (e.g. depth first search numbering). Then each vertex is colored *after* its parent (except for the tree root, which has no parent, but we only need to try one of the three colors there), so each non-root vertex has at most two colors that survive the neighbor test, and the number of branches drops to 2^(n-1).
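
To make this concrete, here is a runnable Python sketch of the improved search (the dict-of-sets graph format and the name three_color are my own choices): vertices are colored in DFS preorder, so every non-root vertex is colored after its parent, and each DFS root gets a single fixed color.

def three_color(adj):
    order, roots, seen = [], set(), set()

    def dfs(u):                     # record a DFS preorder of the vertices
        seen.add(u)
        order.append(u)
        for w in adj[u]:
            if w not in seen:
                dfs(w)

    for u in adj:                   # handle disconnected graphs too
        if u not in seen:
            roots.add(u)
            dfs(u)

    color = {}

    def extend(i):
        if i == len(order):
            return True             # every vertex colored consistently
        v = order[i]
        # a root needs only one color tried; others get the pruned loop
        for c in ((0,) if v in roots else (0, 1, 2)):
            # interleaved validity test: skip colors used by a neighbor
            if all(color.get(w) != c for w in adj[v]):
                color[v] = c
                if extend(i + 1):
                    return True
                del color[v]        # uncolor and backtrack
        return False

    return color if extend(0) else None

For example, three_color({0: {1, 2}, 1: {0, 2}, 2: {0, 1}}) colors a triangle, while adding a fourth vertex adjacent to all three (K4) correctly yields None.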

Changing the solution space

Instead of looking for a 3-coloring, let's look for the subset of red vertices of a 3-coloring. We can then test whether the remaining vertices can be 2-colored (i.e., whether they form a bipartite graph) in linear time. This immediately gives us a 2^n algorithm (that's how many different subsets of vertices there are) without as much effort as we took above. With more thought, we can do even better: if we choose a coloring with as many red vertices as possible, the red vertices will form a maximal independent set: a set of vertices, with no edge connecting any two vertices in the set, such that every remaining vertex in the graph shares an edge with a vertex in the set. So, we can solve 3-coloring by listing all maximal independent sets and testing for each whether the complementary set is bipartite.
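
For completeness, here is a sketch of the linear-time bipartiteness test this relies on (same graph format as above; the name rest_is_bipartite is mine): try to 2-color the non-red vertices by breadth first search, failing exactly when an odd cycle survives.

from collections import deque

def rest_is_bipartite(adj, red):
    side = {}                       # vertex -> 0 or 1 (the two colors)
    for s in adj:
        if s in red or s in side:
            continue
        side[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in red:
                    continue        # red vertices are deleted from the graph
                if w not in side:
                    side[w] = 1 - side[u]
                    queue.append(w)
                elif side[w] == side[u]:
                    return False    # odd cycle: not 2-colorable
    return True

The 2^n algorithm is then just a loop over all independent subsets red, accepting whenever rest_is_bipartite succeeds on the complement.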

The algorithm below solves a slightly more general problem: given a graph G and two sets of vertices Y and N, list all maximal independent sets containing everything in Y and nothing in N.

listMIS(G,Y,N)
{
    if (G = Y u N) output Y and return

    choose a vertex v in G - Y - N

    if (v not adjacent to anything in Y)
        listMIS(G, Y u {v}, N)

    if (v isn't the last undecided neighbor of a vertex in N)
        listMIS(G, Y, N u {v})
}
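
Here is a direct Python transcription (a sketch: the generator interface and the dict-of-sets graph format are my choices, and the second test is spelled out precisely: v may join N only if every excluded vertex keeps some hope of a neighbor in Y).

def list_mis(adj, Y, N, undecided):
    if not undecided:               # G = Y u N: report the set found
        yield set(Y)
        return
    v = min(undecided)              # "choose a vertex v"
    rest = undecided - {v}
    # v can join Y only if it is not adjacent to anything already in Y
    if not (adj[v] & Y):
        yield from list_mis(adj, Y | {v}, N, rest)
    # v can join N only if no vertex of N u {v} is left with neither a
    # neighbor in Y nor an undecided neighbor that could later join Y
    if all((adj[u] & Y) or (adj[u] & rest) for u in N | {v}):
        yield from list_mis(adj, Y, N | {v}, rest)

Calling list(list_mis(adj, set(), set(), set(adj))) on a triangle yields its three maximal independent sets, one per vertex.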

How to analyze? Obviously the time is at most 2^n (each vertex goes either into Y or into N), and one can come up with examples, as in 3-coloring, where a careless vertex ordering leads to at least this much time.

We'd like to do better than this. Here's an idea: each call reduces the size of G-Y-N by only one vertex, and we want to shrink the set of undecided vertices more quickly. If we add v to Y, we can immediately move all of its neighbors into N (the more neighbors the better). Define degree(v) = the number of neighbors of v in G-Y-N. Then when we add v to Y, the size of G-Y-N shrinks by 1+degree(v). If we add v to N, we don't get as big a reduction in subproblem size, but v must eventually have at least one neighbor in Y, and we can branch on which neighbor that is (so we want all of v's neighbors to have high degree).

listMIS(G,Y,N)
{
    if (some vertex in N has all its neighbors in N) return without output
    if (G = Y u N) output Y and return

    choose vertex v s.t. all neighbors have degree(neighbor) >= degree(v)
    (e.g. let v = a minimum degree vertex in G-Y-N)

    listMIS(G, Y u {v}, N u nbrs(v))

    let A = N
    for (each neighbor w of v in G-Y-N) {
        listMIS(G, Y u {w}, A u nbrs(w))
        A = A u {w}
    }
}
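
A Python sketch of this refinement, under the same assumptions as before; as in the pseudocode, the no-output test comes before the leaf test, so a non-maximal Y is pruned rather than reported.

def list_mis_fast(adj, Y=frozenset(), N=frozenset(), undecided=None):
    if undecided is None:
        undecided = frozenset(adj)
    # a vertex of N with all neighbors in N can never gain a neighbor in Y,
    # so Y could still be enlarged: prune without output
    if any(adj[u] <= N for u in N):
        return
    if not undecided:               # G = Y u N: Y is maximal
        yield set(Y)
        return
    # minimum degree vertex of G-Y-N (degrees measured within G-Y-N)
    v = min(undecided, key=lambda u: len(adj[u] & undecided))
    nbrs_v = adj[v] & undecided
    # branch: v joins Y, forcing all its undecided neighbors into N
    yield from list_mis_fast(adj, Y | {v}, N | nbrs_v,
                             undecided - {v} - nbrs_v)
    # otherwise some neighbor w of v joins Y instead; A accumulates the
    # neighbors already tried so the same set is never output twice
    A = set(N)
    for w in nbrs_v:
        newN = frozenset(A) | adj[w]
        yield from list_mis_fast(adj, Y | {w}, newN,
                                 undecided - {w} - newN)
        A.add(w)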

Analysis: suppose v has degree d. Then we get d+1 recursive calls, each on a problem with at least d+1 fewer undecided vertices (this is tight when every neighbor of v also has degree d and the addition of w to A is redundant because w already lies in later neighborhoods).

Oversimplified analysis: suppose d is the same in all calls (it won't be). Then there are n/(d+1) levels of recursion, so time = (d+1)^(n/(d+1)) = ((d+1)^(1/(d+1)))^n. The worst case is d = 2, giving time (3^(1/3))^n = 3^(n/3): 3^(1/3) ~ 1.442 beats both 2^(1/2) ~ 1.414 and 4^(1/4) ~ 1.414.

Now do the analysis more carefully: analyze by counting the number of leaves L(n) of the recursion tree (the total time is within a polynomial factor of this count). We prove by induction that L(n) <= 3^(n/3). In a call with degree(v) = d, we get

L(n) <= (d+1) L(n - (d+1))
     <= (d+1) 3^((n - (d+1))/3)       [induction hypothesis]
      = ((d+1) / 3^((d+1)/3)) 3^(n/3)
     <= 3^(n/3)                       [since d+1 <= 3^((d+1)/3) for all d >= 0]

Corollary: any graph has at most 3^(n/3) maximal independent sets. An example showing the analysis is tight: the disjoint union of n/3 triangles, where each triangle independently contributes a three-way choice of which of its vertices joins the set.
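
Using the list_mis_fast sketch above, the tightness claim is easy to check numerically:

# the disjoint union of k triangles has exactly 3^k maximal independent sets
k = 5
adj = {}
for t in range(k):
    a, b, c = 3*t, 3*t + 1, 3*t + 2
    adj[a], adj[b], adj[c] = {b, c}, {a, c}, {a, b}

print(sum(1 for _ in list_mis_fast(adj)))   # 243 = 3^(15/3)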