Lecture notes for January 9, 1996

- Analysis can be more reliable than experimentation. If we experiment, we only know the behavior of a program on certain specific test cases, while analysis can give us guarantees about the performance on all inputs.
- It helps one choose among different solutions to problems. As we will see, there can be many different solutions to the same problem. A careful analysis and comparison can help us decide which one would be the best for our purpose, without requiring that all be implemented and tested.
- We can predict the performance of a program before we take the time to write code. In a large project, if we waited until after all the code was written to discover that something runs very slowly, it could be a major disaster, but if we do the analysis first we have time to discover speed problems and work around them.
- By analyzing an algorithm, we gain a better understanding of where the fast and slow parts are, and what to work on or work around in order to speed it up.

**The Fibonacci story:**

Leonardo of Pisa (aka
Fibonacci) was interested in many things, including a subject we
now know as *population dynamics*: For instance, how quickly
would a population of rabbits expand under appropriate
conditions?

As is typical in mathematics (and analysis of algorithms is a form of mathematics), we make the problem more abstract to get an idea of the general features without getting lost in detail:

- We assume that a pair of rabbits has a pair of children every year.
- These children are too young to have children of their own until two years later.
- Rabbits never die.

We then express the number of pairs of rabbits as a function of time (measured as a number of years since the start of the experiment):

- F(1) = 1 -- we start with one pair
- F(2) = 1 -- they're too young to have children the first year
- F(3) = 2 -- in the second year, they have a pair of children
- F(4) = 3 -- in the third year, they have another pair
- F(5) = 5 -- we get the first set of grandchildren

The algorithmic problem we'll look at today: how to compute F(n)?

So this seems to be an algorithm: compute 1.618^n - 0.618^n. Problem: how accurately do you have to know x to get the right answer? e.g. if you just use x=1.618, you get

- F(3)=1.99992 -- close enough to 2?
- F(16)=986.698 -- round to 987?
- F(18)=2583.1 -- should be 2584

**Algorithm 1:**

int fib(int n) { if (n <= 2) return 1 else return fib(n-1) + fib(n-2) }An example of the sort of basic question we study in this class is, how much time would this algorithm take? How should we measure time? The natural measure would be in seconds, but it would be nice to have an answer that didn't change every time Intel came out with a faster processor. We can measure time in terms of machine instructions; then dividing by a machine's speed (in instructions/second) would give the actual time we want. However, it is hard to guess from a piece of pseudo-code the exact number of instructions that a particular compiler would generate. To get a rough approximation of this, we try measuring in terms of lines of code.

Each call to fib returns either one or two lines. If n <= 2, we only execute one line (the if/return). if n = 3, execute 2 for fib(2), plus one each for fib(1) and fib(0): 4

It's like the rabbits! Except for the two lines in each call, the time for n is the sum of the times for two smaller recursive calls.

time(n) = 2 + time(n-1) + time(n-2)In general, any recursive algorithm such as this one gives us a

With some work, we can solve the equation, at least in terms of F(n): We think of the recursion as forming a tree. We draw one node, the root of the tree, for the first call then any time the routine calls itself, we draw another child in the tree.

F(5) / \ F(4) F(3) / \ / \ F(3) F(2) F(2) F(1) / \ F(2) F(1)The four internal nodes of this tree for fib(5) take two lines each, while the five leaves take one line, so the total number of lines executed in all the recursive calls is 13.

Note that, when we do this for any call to fib, the Fibonacci number F(i) at each internal node is just the number of leaves below that node, so the total number of leaves in the tree is just F(n). Remember that leaves count as one line of code, internal nodes 2. To count internal nodes, use basic fact about binary trees (trees in which each node has 2 children): the number of internal nodes always equals the number of leaves minus one. (You can prove this by induction: it's true if there's one leaf and no internals, and it stays true if you add 2 children to a leaf.)

So there are F(n) lines executed at the leaves, and 2F(n)-2 at the internal nodes, for a total of 3F(n)-2. Let's double check this on a simple example: time(5) = 3F(5) - 2 = 3(5)-2 = 13.

This is kind of slow e.g. for n=45 it takes over a billion steps. Maybe we can do faster?

This easy idea leads to some complicated algorithms we'll see later in the section on dynamic programming, but here it's pretty simple:

**Algorithm 2:**

int fib(int n) { int f[n+1]; f[1] = f[2] = 1; for (int i = 3; i <= n; i++) f[i] = f[i-1] + f[i-2]; return f[n]; }This is an iterative algorithm (one that uses loops instead of recursion) so we analyze it a little differently than we would a recursive algorithm. Basically, we just have to compute for each line, how many times that line is executed, by looking at which loops it's in and how many times each loop is executed.

Three lines are executed always. The first line in the loop is executed n-1 times (except for n=1) The second line in loop executed n-2 times (except for n=1) so time(n) = n-1 + n-2 + 3 = 2n (except time(1)=4).

As an example for n=45 it takes 90 steps, roughly 10 million times faster than the other program. Even if you don't do this very often this is a big enough difference to notice, so the second algorithm is much better than the first.

Again, we analyze things differently for recursive and iterative programs. For an iterative program, it's usually just a matter of looking at the variable declarations (and storage allocation calls such as malloc() in C). For instance, algorithm 2 declares only an array of n numbers. Analysis of recursive program space is more complicated: the space used at any time is the total space used by all recursive calls active at that time. Each recursive call in algorithm 1 takes a constant amount of space: some space for local variables and function arguments, but also some space for remembering where each call should return to. The calls active at any one time form a path in the tree we drew earlier, in which the argument at each node in the path is one or two units smaller than the argument at its parent. The length of any such path can be at most n, so the space needed by the recursive algorithm is again (some constant factor times) n. We abbreviate the "some constant factor times" using "O" notation: O(n).

It turns out that algorithm 2 can be modified to use a much smaller amount of space. Each step through the loop uses only the previous two values of F(n), so instead of storing these values in an array, we can simply use two variables. This requires some swapping around of values so that everything stays in the appropriate places:

**Algorithm 3:**

int fib(int n) { int a = 1, b = 1; for (int i = 3; i <= n; i++) { int c = a + b; a = b; b = c; } return b; }Here c represents f[i], b represents f[i-1], and a represents f[i-2]. The two extra assignments after the sum shift those values over in preparation for the next iteration. This algorithm uses roughly 4n lines to compute F(n), so it is slower than algorithm 2, but uses much less space.

A problem with the analysis of the two algorithms above: what is a line of code? If I use whitespace to break a line into two, it doesn't change the program speed but does change the number of lines executed. And as mentioned before, if I buy a faster computer, it does change program speed but doesn't change the analysis.

To avoid extraneous details like whitespace and computer type, we use "big O" notation. The idea: we already write the times as a function of n. Big O notation treats two functions as being roughly the same if one is c times the other where c is a constant (something that doesn't depend on n). So for instance we would replace 3F(n)-2 by O(F(n)) and both 2n and 4n by O(n).

Formally, we say that

f(n)=O(g(n))If there is some constant c such that

f(n) <= c g(n)It is true that 4n=O(n), but it is also true that n=O(4n). However note that this is not always a symmetric relation; n=O(F(n)) but it is not true that F(n)=O(n). In practice we will usually only use O notation to simplify formulas, by ignoring constant factors and other extraneous details.

What is the point of O notation? First, it makes life easier by allowing us to be less careful of all the fine details of an algorithm's behavior. But also, it allows us to compare two algorithms easily. Algorithm 2 and algorithm 3 are both O(n). According to the number of lines executed, one is twice as fast as the other, but this ratio does not change as a function of n. Other factors (like the amount of time needed to allocate the large array in algorithm 2) may mean that in actual time, the algorithms are closer to each other; a more careful analysis is needed to determine which of the two to use. On the other hand, we know that 4n is much better than 3F(n)-2 for any reasonable value of n -- but this doesn't depend on the factor of 4 in the 4n time bound, it would be the same for 7n or 12n. For larger and larger n, the ratio of n to F(n) gets very large, so that very quickly any O(n) will be faster than any O(F(n)). Replacing 4n by O(n) is an abstraction that lets us compare it to other functions without certain details (the 4) getting in the way.

Here's a mathematical trick with matrices:

[ 1 1 ] n [ F(n+1) F(n) ] [ 1 0 ] = [ F(n) F(n-1) ](You don't have to remember much linear algebra to understand this -- just the formula for multiplying two symmetric 2x2 matrices:

[ a b ] [ d e ] [ ad + be bd + ce ] [ b c ] [ e f ] = [ bd + ce be + cf ]You can then prove the result above by induction: Let

[ 1 1 ] A = [ 1 0 ]assume by induction that the equation above is is true for some n, multiply both sides by another power of A using the formula for matrix multiplication, and verify that the terms you get are the same as the formula defining the Fibonacci numbers.)

We can use this to define another iterative algorithm, using matrix multiplication. Although I will write this in C syntax, we are starting to get to pseudo-code, since C does not have matrix multiplication built in to it the way I have it written below. The following algorithm initializes a matrix M to the identity matrix (the "zeroth power" of A) and then repeatedly multiplies M by A to form the (n-1)st power. Then by the formula above, the top left corner holds F(n), the value we want to return.

**Algorithm 4:**

int fib(int n) { int M[2][2] = {{1,0},{0,1}} for (int i = 1; i < n; i++) M = M * {{1,1},{1,0}} return M[0][0]; }This takes time O(n) (so much better than algorithm 1) but is probably somewhat slower than algorithm 2 or algorithm 3. (The big O notation hides the difference between these algorithms, so you have to be more careful to tell which is better.) Like algorithm 3, this uses only O(1) space.

But we can compute M^n more quickly. The basic idea: if you want to compute e.g. 3^8 you can multiply 8 3's together one at a time (3*3*3*3*3*3*3*3) or you can repeatedly square: square 3^2 = 9, 9^2 = 3^4 = 81, 81^2 = 3^8 = 6561. The squaring idea uses many fewer multiplications, since each one doubles the exponent rather than simply adding one to it. With some care, the same idea works for matrices, and can be extended to exponents other than powers of two.

**Algorithm 5:**

int M[2][2] = {{1,0}{0,1}} int fib(int n) { matpow(n-1); return M[0][0]; } void matpow(int n) { if (n > 1) { matpow(n/2); M = M*M; } if (n is odd) M = M*{{1,1}{1,0}} }Basically all the time is in matpow, which is recursive: it tries to compute the nth power of A by squaring the (n/2)th power. However if n is odd, rounding down n/2 and squaring that power of A results in the (n-1)st power, which we "fix up" by multiplying one more factor of A.

This is a recursive algorithm, so as usual we get a recurrence relation defining time, just by writing down the time spent in a call to matpow (O(1)) plus the time in each recursive call (only one recursive call, with argument n/2). So the recurrence is

time(n) = O(1) + time(n/2)It turns out that this solves to O(log n). For the purposes of this class, we will use logarithms base 2, and round all logarithms to integers, so log n is basically the number of bits needed to write n down in binary. An equivalent way of defining it is the smallest value of i such that n < 2^i. But clearly if n < 2^i, n/2 < 2^(i-1) and conversely, so log n satisfies the recurrence log(n) = 1 + log(n/2). The recurrence defining the time for matpow is basically the same except with O(1) instead of 1. So the solution to the recurrence is just the sum of log n copies of O(1), which is O(log n).

If n is 1 billion, log n would only be 30, and this algorithm would be better than algorithms 2 and 3 in the same way that they are better than algorithm 1.

(This is actually somewhat cheating: to be able to use this for n equal to a billion you need to be able to write down the answer which will have O(n) digits, and you need to be able to store variables with that many digits. Manipulating such large numbers would take more like O(n) steps per operation, where here we are only counting one step per integer multiplication or addition. But even if you used a special library for dealing with large numbers, algorithm 4 would be much faster than the other ones.)

Actually you can get the original formula 1.618^n to work using a similar repeated squaring trick, also with time O(log n). So to tell which is better you have to be more careful and not just use O-notation -- dealing with an integer matrix is somewhat simpler than having to compute floating point square roots so it wins.

Which is the sort of comparison that analysis of algorithms is all about...

ICS 161 -- Dept.
Information & Computer Science -- UC Irvine

Last update: