Recursion

In this lecture we will discuss the concept of recursion and examine recursive
functions that operate on integers, strings, and lists, learning some common
idioms for each. As with other topics discussed this quarter that you have
already seen, I want to ensure that you have a deeper understanding of
recursion. The concept of recursively defined (sometimes called inductively
defined) data types and recursion is fundamental in many areas of computer
science, and this concept should be discussed from many angles in many of your
ICS classes; you should become comfortable with seeing and applying recursion.

In addition, some programming languages (Lisp, ML, and Haskell are the foremost
examples) use recursion (and also decision: if) as their primary control
structures: any iterative code can be written recursively (and recursion is
even more powerful than iteration, as we glimpsed in the EBNF lecture). Even
languages that are not primarily recursive all support recursion (and have
since the late 1960s), because sometimes using recursion is the best way to
write code to solve a problem: best often means simplest, but sometimes it can
mean most efficient too (efficient in time to write and/or efficient in time
to run).

Python (and C/C++/Java) are not primarily recursive languages. Each has strong
features for iterating through data (Python has the most powerful tools for
such iteration, including generator functions, which can also be recursive).
But, it is important that we learn how to write recursive code in Python too.
Next week, we will recursively define the linear linked list and binary search
tree data structures and see how to manipulate them, both iteratively (for
some functions) and recursively (for all functions). In ICS-46 we will revisit
these data structures (and more) many times using C++, and again see how we
can manipulate them both iteratively and recursively.
Douglas Hofstadter's Pulitzer-prize winning book, "Godel, Escher, Bach" is an
investigation of cognition, and commonly uses recursion and self-reference to
illustrate the concepts it is discussing. It is a fascinating book and because
it is old, can be purchased quite cheaply as a used book.

  http://www.amazon.com/G%C3%B6del-Escher-Bach-Eternal-Golden/dp/0465026567

(at least read some small reviews of this book here). I recommend it highly.

------------------------------------------------------------------------------

Recursion vs Iteration

Recursion is a programming technique in which a call to a function often
results in another call to that same function. In direct recursion, a call to
a function appears in that function's body; in indirect/mutual recursion, the
pattern is some function calls some other function ... which ultimately calls
the first function. In an example where f calls g and g calls f, we say f and
g are mutually recursive with f calling f indirectly via g, and g calling g
indirectly via f.

For some data structures (not many built-into Python) and problems, it is
simpler to write recursive code than its iterative equivalent. In modern
programming languages, recursive functions may run a bit slower (maybe 5%)
than equivalent iterative functions, but this is not always the case (and
sometimes there is no natural/simple iterative solution to a problem); in a
typical application, this time difference is insignificant (most of the time
could be spent elsewhere in the code anyway). When recursion is natural, it
requires less programmer time to write the code.

We will begin by studying the form of general recursive functions; then apply
this form to functions operating on int values, and then apply this form to
functions operating on strings and lists. In all these cases, we will discuss
how values of these types are recursively defined and discuss the natural
"sizes" of the problem solved.
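As a small illustration of the mutual recursion described above (f calls g and
g calls f), here is a minimal sketch. The function names is_even and is_odd
are hypothetical, chosen only for this illustration, and the code assumes its
argument is a non-negative int.

```python
def is_even(n):
    # Assumes n is a non-negative int
    if n == 0:
        return True              # 0 is even
    else:
        return is_odd(n-1)       # n is even exactly when n-1 is odd

def is_odd(n):
    # Assumes n is a non-negative int
    if n == 0:
        return False             # 0 is not odd
    else:
        return is_even(n-1)      # n is odd exactly when n-1 is even

print(is_even(10), is_odd(10))
```

Neither function calls itself directly: is_even reaches itself only by way of
is_odd, and is_odd reaches itself only by way of is_even.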
To start, suppose that we have the problem of collecting $1,000.00 for
charity, with the assumption that when asked, everyone is willing to chip in
the smallest amount of money: a penny.

  Iterative solution: visit 100,000 people, and ask each for a penny

  Recursive solution: if you are asked for a penny, give a penny to this person
                      otherwise
                        visit 10 people and ask them each to collect 1/10 the
                          amount that you are asked to raise;
                        collect the money they give you into one bag;
                        give this bag to the person who asked you for the money

In the iterative version each subproblem is the same: raising a penny. In the
recursive solution, subproblems get smaller and smaller until they reach the
problem of collecting a penny (they cannot get any smaller: this problem has
the smallest size because there is no smaller currency).

The general form of a directly recursive function is

def Solve(Problem):
    if (Problem is minimal/not decomposable into a smaller problem: a base case):
        Solve Problem directly and return solution; i.e., without recursion
    else:
        (1) Decompose Problem into one or more SIMILAR, STRICTLY SMALLER
            subproblems: SP1, SP2, ... , SPn
        (2) Recursively call Solve (this function) on each SMALLER SUBPROBLEM
            (since they are all SIMILAR): Solve(SP1), Solve(SP2),..., Solve(SPn)
        (3) Combine the returned solutions to these smaller subproblems into
            a solution that solves the original, larger Problem (the one
            this original function call must solve)
        (4) Return the solution to the original Problem

Formulate your base cases correctly. DO NOT use try/except to compensate for
omitted or incorrectly written base cases.

------------------------------------------------------------------------------

Simple Recursion in Python:

We will start by examining a recursive definition for the factorial function
(e.g., 5! reads as "five factorial") and then a recursive function that
implements it.
The definition is recursive because we define how to compute a big factorial
in terms of how to compute a smaller factorial. Note that the domain of the
factorial function is the non-negative integers (also called the natural
numbers), so 0 is the smallest number on which we can compute factorial.

  0! = 1                        the smallest value we can compute ! of
  N! = N*(N-1)!  for all N>0;   recursive: we define ! in terms of a smaller !

By this definition (and just substitution of equals for equals) we see that

  5! = 5*4! = 5*4*3! = 5*4*3*2! = 5*4*3*2*1! = 5*4*3*2*1*0! = 5*4*3*2*1*1

We have eliminated all occurrences of !, so we can use just * to compute it:
5! = 5*4*3*2*1*1 = 120.

The first definition below is a transliteration of the general code above,
decomposing it into just one similar (factorial) but simpler (n-1) subproblem.

def factorial (n):
    if n == 0:
        return 1
    else:
        sp1        = n-1              # Decompose problem n into 1 subproblem
        solved_sp1 = factorial(sp1)   # Recursive call to Solve subproblem
        solved_n   = n*solved_sp1     # Solve problem n with solved subproblem
        return solved_n               # Return solution

The next definition is a simplification of how this function should really be
written in Python, without all the intermediate names/steps, which are not
needed and don't really add any clarity.

def factorial (n):
    if n == 0:
        return 1
    else:
        return n*factorial(n-1)

This definition looks clean and closely mirrors the recursive mathematical
description of factorial. In fact, because of the simplicity of this
particular recursive function, we can write an even simpler solution using a
conditional expression; but I prefer the solution above, because it is more
representative of other recursive solutions (to more complicated problems).
def factorial (n):
    return (1 if n == 0 else n*factorial(n-1))

We can contrast the recursive code with the iterative code that implements the
factorial function

from goody import irange

def factorial (n):
    answer = 1
    for i in irange(2,n):
        answer *= i
    return answer

Note that the iterative function defines two local names (answer and i) and
binds 1 to answer and rebinds it to a new value during each execution of the
for loop's body. Likewise, i is rebound to a sequence of values produced when
the function iterates over the irange(2,n). The recursive function defines no
local names and doesn't rebind any names (although each recursive call binds
an argument to the parameter in the new recursive function call).

Rebinding the values of names makes it hard for us to think about the meaning
of code (it makes it tougher to prove that the code is correct too), and makes
it hard for multi-core processors to coordinate in solving a problem.
"Functional programming languages" (those that allow binding of a name to a
computed value, but no rebinding of that name) are more amenable to automatic
parallelization (their code can run more quickly on multi-core computers).
You'll see more about this in later classes at UCI (e.g., Concepts of
Programming Languages).

We can mimic factorial's recursive definition for a function that raises a
number to an integer power. Note that the domain of n for this power function
again requires n to be a natural number (a non-negative integer).

  A**0 = 1                          (yes, this is even true when A=0)
  A**N = A * A**(N-1)  for all N>0

So A**4 = A * A**3 = A*A * A**2 = A*A*A * A**1 = A*A*A*A * A**0 = A*A*A*A*1

We can likewise translate this definition into a simple recursive Python
function

def power(a,n):
    if n == 0:
        return 1
    else:
        return a*power(a,n-1)

By this definition (and just substitution of equals for equals) we see that
calling power(a,n) requires n multiplications.
  power(a,3) = a*power(a,2) = a*a*power(a,1) = a*a*a*power(a,0) = a*a*a*1

Of course we could write this code iteratively as follows, which also requires
n (the same number of) multiplications

def power(a,n):
    answer = 1
    for i in irange(1,n):
        answer *= a
    return answer

But there is another way to compute power(a,n) recursively, shown below. This
longer function requires roughly between Log2 n and 2*Log2 n multiplications.
Here Log2 means the log function using a base of 2. Note that Log2 1,000 is
about 10 (2**10 = 1,024), Log2 1,000,000 is about 20, and Log2 1,000,000,000
is about 30; so, to compute power(a,1000) requires between 10 and 20
multiplications (not the 1,000 multiplications required by the earlier
definitions of power); to compute a to the billionth power would require at
most 60 multiplications.

The key fact we use to reduce the number of multiplications in this function
is that if we need to compute a**100, and we have computed temp = a**50, all
we need to compute a**100 is one more multiplication: temp*temp, because
(a**50)**2 = a**100.

def power(a,n):
    if n == 0:
        return 1
    else:
        if n%2 == 1:                 # n is odd  (if n remainder 2 == 1)
            return a*power(a,n-1)
        else:                        # n is even (if n remainder 2 == 0)
            temp = power(a,n//2)     # n is divided by 2 perfectly: no remainder
            return temp*temp

Here we use a local variable temp, but we bind temp only ONCE (we never rebind
it) and then use its value, which is fine for functional programming. We could
get rid of the local name temp completely by defining the local function
def square(n): return n*n inside power and then calling it in the else
clause: return square( power(a,n//2) ).
def power(a,n):
    def square(n): return n*n

    if n == 0:
        return 1
    else:
        if n%2 == 1:
            return a*power(a,n-1)
        else:
            return square( power(a,n//2) )

For one example power(a,16) computes power(a,8) and returns its result with 1
more multiplication; power(a,8) computes power(a,4) and returns its result
with 1 more multiplication; power(a,4) computes power(a,2) and returns its
result with 1 more multiplication; power(a,2) computes power(a,1) and returns
its result with 1 more multiplication; power(a,1) computes a*power(a,0), which
requires 1 multiplication: computing power(a,0) requires 0 multiplications (it
just returns the value 1). In all, power(a,16) requires just 5
multiplications, not 16.

Note that this function is efficient, but it is NOT guaranteed to always use
the MINIMUM number of multiplications. power(a,15) uses 7* multiplications,
but computing x3 = x*x*x and then x3*(square(square(x3))) requires only 5: see
the topic named "addition-chain exponentiation" if you are interested in what
is known about the minimal number of multiplications required for
exponentiation. No simple algorithm solves this problem in the minimum number
of multiplications. But, the algorithm above does an excellent job.

*multiplications for power(a,15):
   1 + multiplications for power(a,14) =
   2 + multiplications for power(a, 7) =
   3 + multiplications for power(a, 6) =
   4 + multiplications for power(a, 3) =
   5 + multiplications for power(a, 2) =
   6 + multiplications for power(a, 1) =
   7 + multiplications for power(a, 0) = 7 total
 power(a,0) requires 0.

We will prove that this function computes the correct answer later in this
lecture. Truth be told, we can write a fast power function like this
iteratively too, but the iterative code looks much more complicated, and it is
much harder to understand, to analyze its behavior, and to prove correct.
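To check multiplication counts like the ones above without counting by hand,
here is a minimal sketch of an instrumented version of the fast power
function. The name power_count and the idea of returning a (value, count)
tuple are assumptions made for this illustration only; the recursive structure
is the same as in the fast power function above.

```python
def power_count(a, n):
    # Returns a 2-tuple: (a**n, number of multiplications performed)
    # Assumes n is a non-negative int
    if n == 0:
        return (1, 0)                          # base case: no multiplications
    else:
        if n % 2 == 1:                         # n is odd
            value, count = power_count(a, n-1)
            return (a*value, count+1)          # 1 more multiplication
        else:                                  # n is even
            value, count = power_count(a, n//2)
            return (value*value, count+1)      # 1 more multiplication (squaring)

for n in (15, 16, 1000):
    value, count = power_count(2, n)
    print(n, count, value == 2**n)
```

For n = 15 and n = 16 the counts printed are 7 and 5, matching the hand
counts above; for n = 1000 the count is 15, which is between 10 and 20.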
------------------------------------------------------------------------------

Hand Simulation

Next, we will learn how to hand-simulate a recursive function using a "tower
of call frames" in which each resident in an apartment executes the same code
(acting as the function definition) to compute a factorial: he/she is called
by the resident above and calls the resident underneath, when a recursive call
is needed (calling back the resident above when their answer is computed).
While it is useful to be able to hand-simulate a recursive call, to better
understand recursion, hand-simulation is not a good way to understand or debug
recursive functions (the 3 proof rules discussed below are a better way). I
will do this hand simulation on the document camera in class, using the
following form for computing factorial(5).

(call frames are defined/used more technically in ICS-51, and in CS 141/142)

              Factorial Towers

       +---------------------------+
       |    n      int return value|
       | +------+   +------------+ |
       | |      |   |            | |
       | +------+   +------------+ |
-------+---------------------------+--------
       |    n      int return value|
       | +------+   +------------+ |
       | |      |   |            | |
       | +------+   +------------+ |
       +---------------------------+
       |    n      int return value|
       | +------+   +------------+ |
       | |      |   |            | |
       | +------+   +------------+ |
       +---------------------------+
       |    n      int return value|
       | +------+   +------------+ |
       | |      |   |            | |
       | +------+   +------------+ |
       +---------------------------+
       |    n      int return value|
       | +------+   +------------+ |
       | |      |   |            | |
       | +------+   +------------+ |
       +---------------------------+
       |    n      int return value|
       | +------+   +------------+ |
       | |      |   |            | |
       | +------+   +------------+ |
       +---------------------------+
                   ....

------------------------------------------------------------------------------

Proof Rules for Recursive Functions

Now, we will learn how to VERIFY that recursive functions are correct by three
proof rules.
Even more important than proving that EXISTING functions are correct (to
better understand them), we will use these same three proof rules to guide us
when we SYNTHESIZE new recursive functions.

Note that in direct recursion, we say that the function "recurs", not that it
"recurses". Recurses describes what happens when you hit your thumb with a
hammer the second time. I was taught that programmers who use the words
"recurse" or "recurses" are not well-spoken.

The three proof rules should be simple to apply in most cases. These rules
mirror rules for proofs by induction in mathematics. Recursion (thinking about
smaller and smaller problems) and induction (thinking about bigger and bigger
problems) are two sides of the same coin.

(1) Prove that the base case (smallest) problem is processed correctly: the
    function RECOGNIZES the base case and then RETURNS THE CORRECT RESULT
    without a recursive call. This proof should be easy, because base cases
    are small and simple to recognize; once a base case is known, its solution
    (for that one value) is easy to state.

(2) Prove that each recursive call is on a STRICTLY SMALLER-SIZED PROBLEM: the
    problem gets closer to the base case. This proof should be easy because we
    can locate each recursive call by inspecting the function's body. Also,
    there are "standard" ways to recur: for ints, decrease by 1 or by a factor
    of 10 (i.e., x//10, using // (known as floor or truncating division), has
    one fewer digit than x, for x >= 10); strings, tuples, and lists recur on
    slices (e.g., x[1:] has fewer characters or values).

(3) ASSUMING ALL RECURSIVE CALLS SOLVE THEIR SMALLER SUBPROBLEMS CORRECTLY,
    prove that the code combines these solved subproblems correctly, to solve
    the original Problem (the original parameter of the original function
    call). This proof should be easy, because we get to ASSUME something very
    important and powerful: all subproblems are solved correctly.
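To make the "standard" ways to recur named in rule (2) concrete, here is a
minimal sketch of two tiny functions, one recurring on n//10 and one recurring
on the slice s[1:]. The names count_digits and count_chars are hypothetical,
introduced only for this illustration; count_digits assumes a non-negative
int argument.

```python
def count_digits(n):
    # Size of the problem: the number of digits in n (n assumed >= 0)
    if 0 <= n <= 9:                       # base case: 1-digit numbers
        return 1
    else:
        return 1 + count_digits(n//10)    # n//10 has one fewer digit than n

def count_chars(s):
    # Size of the problem: the number of characters in s
    if s == '':                           # base case: the empty string
        return 0
    else:
        return 1 + count_chars(s[1:])     # s[1:] has one fewer character than s

print(count_digits(135), count_chars('abcd'))
```

In each function the argument of the recursive call is strictly smaller (by
one digit or one character) than the parameter, which is what rule (2) asks
us to check.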
Here is a proof, using these 3 rules, that the factorial function is correct:

1) The base case is 0; and according to the recursive mathematical definition,
   0! = 1. This function recognizes an argument/parameter of 0 in the if
   statement and returns the correct value 1 as the result.

2) If n is a non-negative number that is not 0 (not the base case), then this
   function makes one recursive call: n-1 is a smaller-sized problem, closer
   to 0 (the base case) than n is. It is closer by 1: the distance between n-1
   and 0 is 1 less than the distance between n and 0.

3) ASSUMING factorial(n-1) COMPUTES (n-1)! CORRECTLY, this function returns
   n*factorial(n-1), which is n*(n-1)! by our assumption, which according to
   the mathematical definition is the correct answer for n!, the parameter to
   the function call.

Notice that the focus of the proof is on ONE call of the function (not like
the hand simulation method above, which looked at all the recursive calls).
That is, we look at what happens in two situations: the argument/parameter is
a base case (rule 1 above); the argument/parameter is not the base case and
the function recurs (rules 2-3 above). For the recursive case, we don't worry
about more recursive calls, because we get to ASSUME that any further
recursive calls (on smaller problems, which might be the base case or at least
closer to the base cases) compute the correct answer WITHOUT HAVING TO THINK
about how that happens during any of the recursive calls.

When proving recursive functions correct, DO NOT think about what happens when
the function is called recursively later, just assume it produces the correct
answer.

Proof that fast-power function is correct (the code is duplicated from above):

def power(a,n):
    def square(n): return n*n

    if n == 0:
        return 1
    else:
        if n%2 == 1:
            return a*power(a,n-1)
        else:
            return square( power(a,n//2) )

1) The base case is 0; and according to the recursive mathematical definition,
   a**0 = 1.
   This function recognizes an argument of 0 in the if statement and returns
   the correct value 1 for it.

2) If n is a non-negative number that is not 0 (not the base case), then if n
   is odd, n-1 is a smaller-sized problem: closer to 0 (the base case) than n
   is; if n is even (it must be >= 2), n//2 is also a smaller-sized problem:
   closer to 0 (the base case) than n is (2//2 is 1, 4//2 is 2, 6//2 is 3,
   etc). For large n, n//2 is much closer to the base case than n-1, which is
   what gives this method its speed.

3) ASSUMING power(a,n-1) COMPUTES a**(n-1) CORRECTLY AND power(a,n//2)
   COMPUTES a**(n//2) CORRECTLY. We know that any n must be either odd or
   even: if n is odd, this function returns a*power(a,n-1), which is
   a*a**(n-1), so it returns (by simplifying) a**n, which is the correct
   answer for the parameters to this function; likewise, if n is even, this
   function returns the value square( power(a,n//2) ), which is
   square(a**(n//2)), which returns (by simplifying) a**n, which is the
   correct answer for the parameters to this function. For all even numbers n,
   n//2 is half that value, with no truncation: for example, for n the even
   number 10, square (a**(10//2)) = square (a**5) = (a**5)**2 = a**10.

Again, the focus of the proof is on ONE CALL of the function: the parts
concern only the base case and the recursive case (which itself is now two
cases, depending on whether n is odd or even): and for the recursive cases, we
don't worry about what happens in other recursive calls, because we get to
assume that any recursive calls (on smaller problems, closer to the base
cases) compute the correct answer; we DO NOT have to think about what happens
during the later recursion.

The Three Proof Rules are Necessary:

What happens if we write factorial incorrectly? Will the proof rules fail?
Yes, for any flawed definition one will fail. Here are three examples (one
failure for each proof rule).

def factorial (n):
    if n == 0:
        return 0               # 0! returns 0, not 1
    else:
        return n*factorial(n-1)

This factorial function violates the first proof rule. It returns 0 for the
base case; since everything is multiplied by the base case, ultimately this
function always returns 0. Bar bet: you name the year and the baseball team,
and I will tell you the product of all the final scores (last inning) for all
the games they played that year. How do I do it and why don't I make this kind
of bet on basketball teams?

def factorial (n):
    if n == 0:
        return 1
    else:
        return factorial(n+1)//(n+1)  # n+1 not closer to base case: 0

This factorial function violates the second proof rule. It recurs on n+1,
which is a bigger-sized problem: farther away from (not closer to) the base
case. Although mathematically (n+1)!//(n+1) = (n+1)*n!//(n+1) = n! this
function will continue calling factorial with ever-larger arguments: a runaway
(or infinite) recursion. Actually, each recursive call takes up some space (to
store its argument, see the hand simulation, which requires binding an
argument for each recursive call), so eventually memory will be exhausted and
Python will raise an exception.

In fact, there is a recursion limit in Python (we can set it to anything we
want), such that if a recursive function does more than that number of
recursive calls, Python raises the RecursionError exception, with the
associated string "maximum recursion depth exceeded". We can examine/set the
recursion limit by importing the sys module and then calling the
sys.getrecursionlimit() or sys.setrecursionlimit(some number) functions.

def factorial (n):
    if n == 0:
        return 1
    else:
        return n+factorial(n-1)  # Notice the plus: n+(n-1)! is not n!

This factorial function violates the third proof rule. Even if we assume that
factorial(n-1) computes the correct answer, this function returns n added to
(not multiplied by) that value, so it does not return the correct answer.
In fact, it returns one more than the sum of all the integers from 1 to n
(because for 0 it returns 1) not the product of these numbers.

In summary, each of these functions violates a proof rule and therefore
doesn't always return the correct value. The first function always returns the
wrong value; the second function returns the correct value, but only for the
base case; it never returns a value for any other argument; the third function
returns the correct value, but only for the base case.

----------
Interlude: Proving the Proof Rules Prove a Recursive Function is Correct

We can actually prove that these proof rules are correct! Here is the proof.
This is not simple to understand (unless you have thought a lot about
recursion) but it is short so I will write the proof here and let you think
about it (and reread it a dozen times, maybe later in the quarter :). The
proof form (a proof by contradiction) should be familiar to ICS 6B/6D
students.

Assume that we have in fact proven that these three proof rules hold for some
recursive function f. And assume that we assert that the function is not
correct. We will show that these two assertions lead to a contradiction.

0) If f is not correct, then there must be some problems that it does not
   correctly solve. And, if there are any problems that f does not solve
   correctly, there must be a SMALLEST problem that it does not solve
   correctly: call this smallest problem p.

1) Because of proof rule (1) we know that p cannot be the base case, because
   we have proven f recognizes and solves the base case correctly.

2) So, f must solve p by calling itself recursively.

3) Since f solves p by calling itself recursively, the recursive call(s)
   is/are on problem(s) SMALLER THAN p: we know this by proof rule (2) which
   states that f always recurs on a STRICTLY SMALLER problem size.

4) We know that f correctly solves this/these smaller problem(s), because p,
   BY DEFINITION, is the SMALLEST PROBLEM THAT f SOLVES INCORRECTLY.
   So it must solve the recursive call(s) on the smaller problem(s) correctly.

5) We also know by proof rule (3) that assuming f solves all problems smaller
   than p (which it does, because p is the SMALLEST PROBLEM f DOES NOT SOLVE
   CORRECTLY), then f will use these solutions of smaller problems to solve
   the bigger problem p correctly. So, f must solve p correctly, contradicting
   our assumption.

Therefore, we have a contradiction. If we have proven a function correct by
the three proof rules, it is impossible to find a smallest problem that f
incorrectly solves; so, f must solve all problems correctly.

Well, that is how the proof goes. If we assume that we have proven the 3 proof
rules and that the function is incorrect, we are led to a contradiction. So,
either we haven't proven the 3 proof rules (we made some mistake in our proof)
or the function is not incorrect (yes, a double negative, so it is correct).
----------

------------------------------------------------------------------------------

Mathematics Recursively (we will skip this section; you might be interested in
reading it)

We can construct all the mathematical and relational operators on natural
numbers (integers >= 0) given just three functions and if/recursion. We can
recursively define the natural numbers as:

  0 is the smallest natural number
  for any natural number n, s(n) (the successor of n: n+1) is a natural number

Now we define three simple functions z(ero), p(redecessor), and s(uccessor).

def z(n):      # z(n) returns whether or not n is 0
    return n == 0

def s(n):      # s(n) returns the successor to n: n+1
    return n+1

def p(n):      # p(n) returns the predecessor of n, if one exists!
    if not z(n):   # 0 has no predecessor
        return n-1
    else:
        raise ValueError('p: cannot compute predecessor of 0')

Note we should be able to prove/argue/understand the following:

  z(s(n)) is always False
  p(s(n)) is always n
  s(p(n)) is n if n != 0 (otherwise calling p(n) first raises an exception)

Given these functions, we can define functions for all arithmetic
(+ - * // **) and relational (== < ... and all the other relational)
operators. For example

def sum(a,b):                      # a+b
    if z(a):                       # a == 0
        return b                   # return b: 0 + b = b
    else:                          # a != 0
        return sum( p(a), s(b) )   # return (a-1)+(b+1) = a+b

Proof of correctness

1) The base case is a==0; and according to our knowledge of mathematics,
   sum(0,b) is 0+b which is b. This function returns b when the argument
   a == 0.

2) If z(a) is not True (a is not 0), then p(a) as the first argument in the
   recursive call to sum is closer to the base case of 0. Because a is not 0,
   there is a predecessor of (a number one smaller than) a.

3) Assuming that sum(p(a),s(b)) computes its sum correctly, we have
   sum(p(a),s(b)) = p(a)+s(b) = (a-1)+(b+1) = a + b, so returning this result
   correctly returns the sum of a and b.

Another way to define this function is

def sum(a,b):
    if z(a):                       # a == 0
        return b                   # return b: 0 + b = b
    else:                          # a != 0
        return s( sum( p(a), b) )  # return (a-1)+(b) + 1 = a+b

We can also use the 3 proof rules to prove this function correctly computes
the sum of any two non-negative integers. Here it applies
s(..smaller sum...) instead of decreasing one argument while increasing the
other.

Given either definition of the sum function, we can similarly define the mult
function, multiplying by repeated addition.

def mult(a,b):
    if z(a):                          # a == 0
        return 0                      # return 0: 0*b = 0
    else:                             # a != 0
        return sum(b, mult(p(a),b))   # return b+((a-1)*b) = b+a*b-b = a*b

Switching from arithmetic to relational operators....
def equal(a,b):
    if z(a) or z(b):                # a == 0 or b == 0 (either == 0)
        return z(a) and z(b)        # return True (if both ==0), False (if one !=0)
    else:                           # a != 0 and b != 0 (neither ==0); can recur
        return equal(p(a),p(b))     # return a-1==b-1 which is the same as a==b

def less_than(a,b):
    if z(a) or z(b):                # a == 0 or b == 0 (either == 0)
        return z(a) and not z(b)    # return True only when a==0 and b!=0
    else:                           # a != 0 and b != 0 (neither ==0); can recur
        return less_than(p(a),p(b)) # return a-1 < b-1, same as a < b

We also might find it useful to do a hand simulation of these functions, with
the two parameters a and b stored in each "apartment" and passed as arguments.

The right way to illustrate all this mathematics is to write a class Natural,
with these methods, and then overload/define __add__ etc. for all the
operators.

------------------------------------------------------------------------------

Synthesizing recursive string methods

We can define strings recursively:

  '' is the smallest string (empty string; string whose len == 0)
  a character concatenated to the front of any string is a (bigger) string

Let's use these proof rules to write a recipe for synthesizing (and therefore
proving correct as we are writing them) a few recursive functions that process
strings. Here is our approach:

(1) Find the base (non-decomposable) case(s) and solve them
      Write code that detects the base case and returns the correct answer
      for it, without using recursion

(2) Assume that we can decompose all non base-case problems and then solve
    these smaller subproblems via recursion
      Choose (requires some ingenuity) the decomposition; it should be
      "natural"

(3) Write code that combines these solved subproblems (often there is just
    one) to solve the problem specified by the parameter

We can use these rules to synthesize a method that reverses a string. We
start with

def reverse(s):

(1) Please take time to think about the base case: the smallest string.
Most students will think that a single-character string is the smallest, when
in fact a zero-character string (the empty string) is smallest. It has been my
experience that more students screw up on the base case than the recursive
case. Once we know the smallest string is the empty string, we need to detect
it and return the correct result without recursion: the reverse of an empty
string is an empty string.

def reverse(s):
    if s == '':         # or len(s) == 0
        return ''       # obvious reversal of empty string
    else:
        Recur to solve a smaller problem
        Use the solution of the smaller problem to solve the original problem: s

We can guess the form of the recursion as reverse(s[1:]) noting that the slice
s[1:] computes a string with all characters but the one at index 0: all the
characters after the first. We are guaranteed to be slicing only non-empty
strings (those whose answer is not computed by the base case), so the slice
will always be a smaller string: smaller by one character. We get to assume
that the recursive call correctly returns the reverse of the string that
contains all characters but the first.

def reverse(s):
    if s == '':         # or len(s) == 0
        return ''
    else:
        Use the solution of reverse(s[1:]) to solve the original problem: s

Now, think concretely, using an example. If we called reverse('abcd') we get
to assume that the recursive call works: so reverse(s[1:]) is reverse('bcd')
which we get to assume returns 'dcb'. How do we use the solution of this
subproblem to solve the original problem, which we know must return 'dcba'? We
need to concatenate 'a' (the first character, at s[0], not included in the
recursive call) to the end of the reverse of all the other characters:
'dcb' + 'a', which evaluates to 'dcba', the reverse of all the characters in
the parameter string.
Generally we write this function as

def reverse(s):
    if s == '':         # or len(s) == 0
        return ''
    else:
        return reverse(s[1:]) + s[0]

We know the result has the first letter of s at the end of the reversal of all
its other characters.

We have now written this method by ensuring the three proof rules are
satisfied so we don't have to prove them, but note that

(1) the reverse of the smallest string (empty) is computed/returned correctly

(2) the recursive call is on a string argument smaller than s (all the
    characters from index 1 to the end, skipping the character at index 0, and
    therefore a string with one fewer character)

(3) ASSUMING THE RECURSIVE CALL RETURNS THE CORRECT ANSWER FOR THE SMALLER
    STRING, then by concatenating the first character after the end of the
    solution to the smaller problem, we have correctly reversed the entire
    string (solving the problem for the original parameter).

In fact, we can use a conditional expression to rewrite this code as a single
line as well.

def reverse(s):
    return ('' if s == '' else reverse(s[1:]) + s[0])

Here is a similar recursive function for reversing the values in a list.

def reverse(l):
    if l == []:         # or len(l) == 0
        return []
    else:
        return reverse(l[1:]) + [l[0]]  # [l[0]] for right operand of + (a list)

Now we will write a recursive function that returns the string equivalent of
an int using the same approach: satisfying the three proof rules. We know that
Python's str function (automatically imported from the builtins module) will
call int.__str__ which returns the string representation of an int. We can
actually now write this function recursively ourselves, and at the same time
prove it is correct.

To start, we assume that the integer is non-negative (it simplifies the
recursive code and we will add code to remove this assumption later).
Unlike the factorial and power functions, here the size of the integer will be the number of digits it contains: the smallest non-negative integers (0-9) contain 1 digit, so that is the smallest size problem. So, we start with the header and base case.

  def to_str(n):
      if 0 <= n <= 9:                   # 1 digit (no 0 digit numbers)
          return '0123456789'[n]        # 0<=n<=9, so no index error
      else:
          Recur to solve smaller problem(s)   # n has at least two digits
          Use the solution of the smaller problem(s) to solve the original problem: n

We can guess the form of the recursion as to_str(n//10) and to_str(n%10) because n//10 is all but the last digit in n, and n%10 is the last digit. If n has d digits (where d>=2), then both n//10 and n%10 will have fewer digits: n//10 has d-1 digits and n%10 has 1 digit. We get to assume that the recursive calls correctly return the string representations of these numbers.

  def to_str(n):
      if 0 <= n <= 9:                   # 1 digit (no 0 digit numbers)
          return '0123456789'[n]        # 0<=n<=9, so no index error
      else:
          Use the solution of to_str(n//10) and to_str(n%10) to solve to_str(n)

Now, think about a concrete example. If we call to_str(135) we get to assume that the recursive calls work: so to_str(n//10) is to_str(13), which we get to assume returns '13'; and to_str(n%10) is to_str(5), which we get to assume returns '5'. How do we use the solutions of these subproblems to solve the original problem? We need to concatenate them together: '13'+'5' = '135'. Generally we write this function as

  def to_str(n):
      if 0 <= n <= 9:                   # 1 digit (no 0 digit numbers)
          return '0123456789'[n]        # 0<=n<=9, so no index error
      else:
          return to_str(n//10) + to_str(n%10)

We have now written this function by ensuring the three proof rules are satisfied. Note that

(1) the to_str of the smallest ints (1 digit) are computed/returned correctly

(2) the two recursive calls are on int arguments that are smaller than n by at least one digit (in fact the argument to the second call is always exactly 1 digit).
(3) ASSUMING THE RECURSIVE CALLS WORK CORRECTLY FOR THE SMALLER ints, then by concatenating the results of the recursive calls on these two numbers together, we have correctly found the string representation of n (solving the original problem).

We make this function work for negative numbers by redefining to_str, moving its original code into a locally defined function, and changing the body of to_str to prepend either nothing or a '-' to the answer, depending on n.

  def to_str(n):
      def to_str_digits(n):             # n >= 0 (see call with abs)
          if 0 <= n <= 9:               # 1 digit (no 0 digit numbers)
              return '0123456789'[n]    # 0<=n<=9, so no index error
          else:
              return to_str_digits(n//10) + to_str_digits(n%10)

      return ('' if n >= 0 else '-') + to_str_digits(abs(n))
      # or
      #return (to_str_digits(n) if n >= 0 else '-'+to_str_digits(-n))

In fact, the following function uses the same technique (but generalizes it by converting to an arbitrary base) to compute the string representation of a non-negative number in any base from binary up to hexadecimal: to_str(11,2) returns '1011'; to_str(11,16) returns 'B'.

  def to_str(n,base=10):                # n >= 0; only bases 2 - 16 allowed
      if 0 <= n <= base-1:
          return '0123456789ABCDEF'[n]  # 0<=n<=15, so no index error
      else:
          return to_str(n//base,base) + to_str(n%base,base)

Finally, let's write a recursive function that has two parameters on which we will recur. This example is similar to the equal function applied to two integers defined in the Mathematics Recursively section above. Suppose we want to write a same_length function that tests whether the lengths of its two string parameters are equal, without ever explicitly computing the length of either. With two recursive parameters we have 4 possible cases:

                             Parameter 2
                          Empty      Not empty
                       +-----------+-----------+
             Empty     | Equal     | Not equal |
  Parameter 1          +-----------+-----------+
             Not empty | Not equal | recur     |
                       +-----------+-----------+

In three of the four (any case in which one or both parameters are empty), we can immediately compute the answer without recursion.
If both parameters are empty then the strings have the same length; if one parameter is empty and one isn't, then the strings have different lengths. Only if both are not empty do we need to recur to compute the correct answer. Here are four ways to write the base cases.

  1) if s1 == '' and s2 == '':
         return True
     if s1 == '' and s2 != '':
         return False
     if s1 != '' and s2 == '':
         return False

  2) if s1 == '':
         return s2 == ''
     if s2 == '':
         return False                 # if got here, s1!='' and s2==''

  3) if s1 == '' or s2 == '':         # if either is empty, ...
         return s1 == '' and s2 == '' # returns True if both empty

  4) if s1 == '' or s2 == '':         # if either is empty, ...
         return s1 == s2              # returns True if the same (empty)

I like 3 the best; 4 is smallest but a bit harder to understand. So we can start this function as

  def same_length(s1,s2):
      if s1 == '' or s2 == '':
          return s1 == '' and s2 == ''
      else:
          Recur to solve a smaller problem   # s1/s2 each are not empty
          Use the solution of the smaller problem to solve the original problem

Now, if Python executes the else: clause then it has two non-empty strings, for each of which we can compute a substring (all the characters after the first). If the substrings have the same length, then the original strings have the same length (each substring is one character shorter than its original); if the substrings don't have the same length then the original strings don't have the same length. So, solving this problem for the substrings is exactly the same as solving it for the original strings. So we can write the recursive call as

  def same_length(s1,s2):
      if s1 == '' or s2 == '':
          return s1 == '' and s2 == ''
      else:
          return same_length(s1[1:],s2[1:])

Note that if we compared the lengths of a huge string and a tiny one, we would find that they are different after a number of recursive calls proportional to the length of the tiny string. Generally, it recurs once for each character in the shorter string. This is an example of "tail-recursion", which we will study more in the next lecture on functional programming.
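The claim about how often same_length recurs can be checked with a small sketch. The helper name same_length_calls is hypothetical (it is not part of the lecture code); it mirrors same_length but returns the answer together with the number of calls made.

```python
def same_length(s1, s2):
    # Base case: at least one string is empty; equal lengths only if both are.
    if s1 == '' or s2 == '':
        return s1 == '' and s2 == ''
    else:
        return same_length(s1[1:], s2[1:])


def same_length_calls(s1, s2):
    # Hypothetical helper: returns (answer, number of calls, counting this one).
    if s1 == '' or s2 == '':
        return (s1 == '' and s2 == '', 1)
    else:
        answer, calls = same_length_calls(s1[1:], s2[1:])
        return (answer, calls + 1)


print(same_length('abc', 'xyz'))              # True
print(same_length('abc', 'xy'))               # False
print(same_length_calls('a' * 500, 'abc'))    # (False, 4)
```

The last call makes 4 calls in total: one per character of 'abc', plus the call that reaches the base case. Each slice such as s1[1:] still copies the remaining characters, so this count measures recursive calls, not total running time.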
------------------------------------------------------------------------------

Recursive list processing

Now, here are some simple recursive list processing functions. As with strings, we can slice a list to get a smaller list, with the slice l[1:] especially common and useful. We can define lists recursively:

  [] is a list
  a value concatenated to the front of a list is a (bigger) list

If there were no len function for lists, we could easily define it recursively:

  def len(l):
      if l == []:
          return 0
      else:
          return 1 + len(l[1:])

Could you start from scratch and define this as illustrated above? Likewise for a sum function

  def sum(l):
      if l == []:
          return 0
      else:
          return l[0] + sum(l[1:])

Below, the all_pred function returns True if and only if predicate p always returns True (never returns False) when called on every value in the list.

  def all_pred(l,p):   # where p is some predicate whose domain includes l's values
      if l == []:
          return True
      else:
          return p(l[0]) and all_pred(l[1:],p)

Note that because and is a short-circuit operator, it recurs only as far as the first False value, at which point it does not (need to) call all_pred(l[1:],p) recursively. When we study efficiency, we will discover that the way Python represents lists (as growable arrays) makes recursion very inefficient compared to iteration, but when we study linked lists and trees (briefly this quarter, extensively in ICS-46) we will see that for those implementations, recursion is as fast as iteration (and the code can be much simpler to write).

Finally, you might wonder why the base case, all_pred([],p), returns True. What should the function return for an empty list? Well, imagine we are one call above the empty list: a list with one value. What should all_pred([a],p) return? Well, it should return p(a): True if p(a) is True for this one-element list, False if it is False. What does the recursive part of this function return? It returns p(a) and all_pred([],p).
So we need to solve an equation to determine what all_pred([],p) should be. When does

  p(a) == (p(a) and all_pred([],p))

Solving this equation, we find that all_pred([],p) must be True: if it were False, p(a) and all_pred([],p) would be the same as p(a) and False, which would always be False, not the required answer of p(a). But p(a) and True has the same truth value as p(a). Based on this same logic, here is what the base cases must be, categorized by the operator applied to the recursive call.

  base case = True      ... for and recursive-call (as we saw in all_pred)
  base case = False     ... for or recursive-call
  base case = 0         ... for + recursive-call
  base case = 1         ... for * recursive-call (as we saw in ! and **)
  base case = -infinity ... for max(...,recursive_call)
  base case = +infinity ... for min(...,recursive_call)

Generally, the value returned for the base case must be the identity for the operator used.

Finally, we can use recursion to sum all the values yielded by an iterator as follows. The only way to test whether an iterator is at its base case (having no values left in it) is to call next on it to see if a value is produced or if StopIteration is raised. In the code below, a base case is detected by a call to next in a try block raising the StopIteration exception; the except block handles this base case by returning the correct value: 0.

  def sum(iterator):
      try:
          x = next(iterator)
          return x + sum(iterator)
      except StopIteration:
          return 0

Here we assume the argument to sum is an iterator (not an iterable: we don't call iter on the argument). So we can use this function as is to compute sum(iter(range(100))) or sum(primes(100)): the first calls iter explicitly to convert the iterable range(100) into an iterator on the range; the second has the argument primes(100), whose returned result (by calling this generator function) is an iterator. But calling sum(range(100)) for this function would not be correct, because range(100) is an iterable, not an iterator, so next cannot be called on it.
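The identity rule and the iterator idiom can be combined in a short sketch. The names sum_iterator, product_iterator, and max_iterator are hypothetical (chosen here so the builtin sum is not shadowed); each uses the identity of its operator as the base-case value, and each assumes its argument is already an iterator.

```python
def sum_iterator(iterator):
    # Base case value 0 is the identity for +.
    try:
        x = next(iterator)
        return x + sum_iterator(iterator)
    except StopIteration:
        return 0


def product_iterator(iterator):
    # Base case value 1 is the identity for *.
    try:
        x = next(iterator)
        return x * product_iterator(iterator)
    except StopIteration:
        return 1


def max_iterator(iterator):
    # Base case value -infinity is the identity for max.
    try:
        x = next(iterator)
        return max(x, max_iterator(iterator))
    except StopIteration:
        return float('-inf')


print(sum_iterator(iter(range(100))))         # 4950
print(product_iterator(iter([1, 2, 3, 4])))   # 24
print(max_iterator(iter([3, 9, 2])))          # 9

# A range is an iterable, not an iterator, so next fails with TypeError.
try:
    sum_iterator(range(100))
except TypeError as error:
    print('TypeError:', error)
```

Note that the TypeError is not caught by the except StopIteration clause inside the function, so it propagates to the caller.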
Given that calling iter on something that already is an iterator returns that same iterator (in Python classes, and in the code that we wrote for __iter__ in each _iter class we wrote), we could write this code as

  def sum(iter_x):         # iter_x is an iter...able or an iter...ator
      i = iter(iter_x)     # returns just iter_x if it is already an iterator
      try:
          x = next(i)
          return x + sum(i)
      except StopIteration:
          return 0

Now the calls sum(iter(range(100))), sum(primes(100)), and even sum(range(100)) all work correctly.

------------------------------------------------------------------------------

Proving Properties of Recursively defined functions using Mathematical Induction

Not only can we prove that recursive functions compute the correct result, we can also prove relationships among recursive functions. For example, if we use the definitions

  def length(l : list) -> int:
      if l == []:
          return 0
      else:
          return 1 + length(l[1:])

  def append(a : list, b : list) -> list:
      if a == []:
          return b
      else:
          return [a[0]] + append(a[1:], b)

we can prove that length(append(a,b)) = length(a) + length(b); that is, the length of the result of appending two lists is the sum of the lengths of the lists.

We will prove this formula by induction on the length of list a, also using the definitions of these two functions. We know nothing about b, other than it is a list of some size (empty or not). We will prove the formula for the base case (list a is empty) and the inductive case (prove that if the formula holds for a list of any length, then it also holds for a list whose length is one bigger).
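Before the proof, a quick empirical spot check can build confidence in the claim. This is only a sketch: it tries a handful of arbitrarily chosen sample lists (an assumption made for illustration), so it is not a substitute for the induction proof, which covers all lists.

```python
def length(l: list) -> int:
    if l == []:
        return 0
    else:
        return 1 + length(l[1:])


def append(a: list, b: list) -> list:
    if a == []:
        return b
    else:
        return [a[0]] + append(a[1:], b)


# Arbitrary sample lists, including the empty list (the base case).
samples = [[], [1], [1, 2], ['x', 'y', 'z'], [None, [], 0, '']]

for a in samples:
    for b in samples:
        # The property to be proven, checked on this pair of lists.
        assert length(append(a, b)) == length(a) + length(b)
        # append should also agree with Python's + on lists.
        assert append(a, b) == a + b

print('all checks passed')
```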
Note that in the steps below, if list l is bound to [v]+a, then

  l[0]  = v
  l[1:] = a

1) Base case: a is the smallest list possible, a = []

  length(append(a,b))       : left side of equation
  length(append([],b))      : substitute [] for a
  length(b)                 : append's a==[] is true; returns b

  length(a) + length(b)     : right side of equation
  length([]) + length(b)    : substitute [] for a
  0 + length(b)             : length's l==[] is true; returns 0
  length(b)                 : 0 + x = x

So, for the base case a = [], we have proven that

  length(append(a,b)) = length(b) = length(a) + length(b)

2) Inductive case: Assuming length(append(a,b)) = length(a) + length(b), we must prove that length(append([v]+a,b)) = length([v]+a) + length(b) is true; here [v]+a is any list whose length is one bigger than a's.

  length(append([v]+a,b))   : left side of equation
  length([v]+append(a,b))   : append's a==[] is false; returns [v]+append(a,b)
  1+length(append(a,b))     : length's l==[] is false; returns 1+length(append(a,b))
  1+length(a)+length(b)     : by induction hypothesis/assumption for list a

  length([v]+a)+length(b)   : right side of equation
  1+length(a)+length(b)     : length's l==[] is false; returns 1+length(a)

For the inductive case, assuming length(append(a,b)) = length(a)+length(b), we have proven

  length(append([v]+a,b)) = 1+length(a)+length(b) = length([v]+a)+length(b)

Program verification is an area in CS that has a long and interesting history (and is still an important area of research). While large programs are not often proven correct, some of their core algorithms are. Sometimes, failed attempts at complicated proofs of correctness have uncovered subtle errors in code: errors that don't occur often in practice, but can occur in edge-cases. Verification is important for small, mission-critical software, say software controlling the Mars rover.

------------------------------------------------------------------------------

Problems:

1.
Define a recursive function named is_odd using the functions z, p, and s described in the lecture, which computes whether or not its argument is an odd value.

2. Define a recursive function named remove, which takes a string and a 1-character string, and returns a string with the specified character removed: remove('afghanistanbananastand','a') returns 'fghnistnbnnstnd'.

3. Define a recursive function named replace, which takes a string and two 1-character strings, and returns a string with the first specified character replaced by the second: replace('potpourri','o','O') returns 'pOtpOurri'.

4. Define a recursive function named contains, which takes a list and a value as arguments, and returns whether or not the value appears in the list.

5. Define a recursive function named is_sorted, which takes a list as an argument, and returns whether or not the list of values is non-decreasing (each value is >= the value preceding it).

6. Define the function equals(s1,s2), which computes whether two strings are == without ever comparing more than 1-character strings.

7. Write less_than(s1,s2), which computes whether s1 < s2 (where both are strings) without ever comparing more than 1-character strings. The result should be the same as using < (the standard Python comparison).

8. Write a function named min_stamps that takes an amount as an argument and returns the minimum number of stamps that you need to make that amount. Assume that inside the function you define denominations as a list with all the stamp amounts: e.g., denominations = [1, 2, 5, 12, 16, 24]. With these denominations min_stamps(19) returns 3 (denominations 1, 2, 16 or 2, 5, 12).