Binary (Search) Trees

In this lecture we upgrade our discussion of self-referential structures from
linked lists to (binary) trees, by creating TN: a class that includes a value
and two references to other objects from the TN class (or None). What seems
like a trivial extension turns out to be profound: like going from a
1-dimensional world to a 2-dimensional world. There are entire books written
about trees (in both computer science and mathematics), but (almost) no books
written solely about linked lists.

Over the next two lectures we will examine a few applications for trees. We
will discuss ordered binary trees (search trees) and structure trees
(expression trees), and discuss various recursive functions that operate on
them. Both use the same definition of the TN (tree node) class shown below.

class TN:
    # A binary tree: each TN has two children
    def __init__(self : "TN", value : object, left : "TN" = None, right : "TN" = None):
        self.value = value
        self.left  = left
        self.right = right

We write "TN" in the annotations above, because when defining TN we cannot use
TN for an annotation (because it hasn't been completely defined yet). For the
left and right parameters, we really should annotate "TN or NoneType". We will
discuss many operations on trees below, written as functions. We could also
implement these operations as methods in the TN class.

------------------------------------------------------------------------------
Binary Search Trees

Binary search trees have a structure property and an order property. The
STRUCTURE PROPERTY dictates that every PARENT node has 0, 1, or 2 CHILDREN
nodes (called left and right; each is another binary tree (TN) or None). We
draw a binary tree with its ROOT on the top, its left and right children
below, and its leaves at the bottom (a LEAF is a node with 0/no children: its
self.left and self.right are both None; an INTERNAL node has at least one
non-None child).
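As a quick sketch (using the TN class above; the 3-node tree here is my own
small example), we can build a tree by nesting constructor calls:

```python
# A minimal sketch using the TN class from above: build the 3-node tree
#      5
#     / \
#    3   8
class TN:
    # A binary tree: each TN has two children
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left  = left
        self.right = right

root = TN(5, TN(3), TN(8))
print(root.value)        # 5
print(root.left.value)   # 3
print(root.right.value)  # 8
print(root.left.left)    # None: 3 is a leaf, so both its children are None
```

Note that omitted left/right arguments default to None, so TN(3) constructs a
leaf directly.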
A binary SEARCH tree (one special kind of binary tree) also has an ORDER
PROPERTY: it dictates that all values in the left subtree of any node are LESS
THAN that node's value, and all values in the right subtree of any node are
GREATER THAN that node's value. Typically binary search trees store unique
values (and we will assume so in this lecture); if we needed to store
duplicates, we could change the order property to: all values in the left
subtree of any node are LESS THAN OR EQUAL TO that node's value.

Structurally, binary trees are much more interesting than linked lists:
structurally (ignoring values) there is only one linked list of length 4:

  x
+---+    +---+---+    +---+---+    +---+---+    +---+---+
| --+--->| ? | --+--->| ? | --+--->| ? | --+--->| ? | / |
+---+    +---+---+    +---+---+    +---+---+    +---+---+

But there are 14 different binary trees with four nodes. Here is a listing of
all 14. They are arranged in two groups, such that in each group a tree and
its mirror image are above/below each other.

    ?         ?             ?
   / \       / \           /
  ?   ?     ?   ?         ?
 /           \           / \
?             ?         ?   ?

  ?           ?           ?
 / \         / \           \
?   ?       ?   ?           ?
     \         /           / \
      ?       ?           ?   ?

                -----

      ?       ?         ?       ?
     /       /         /       /
    ?       ?         ?       ?
   /       /           \       \
  ?       ?             ?       ?
 /         \           /         \
?           ?         ?           ?

?           ?         ?           ?
 \           \         \           \
  ?           ?         ?           ?
   \           \       /           /
    ?           ?     ?           ?
     \         /       \         /
      ?       ?         ?       ?

Given the ORDER PROPERTY, you might think that the shape of a binary search
tree is UNIQUELY DETERMINED by the values that it contains. THIS IS NOT SO.
For example, a binary search tree with the values 1, 2, 3, and 4 can be
represented by any of these structures

      3          2          4
     / \        / \        /
    1   4  or  1   3  or  1
     \              \      \
      2              4      3
                           /
                          2

or any of the 14 structures above, with the right selection of node values.
Note that for EVERY NODE in the binary search trees above (not just the ROOT),
the parent's value is > all values in its left subtree and < all values in its
right subtree. Later, when we study the add function, we will learn that the
shape of a binary search tree is determined not just by the values that it
contains, but also by the ORDER IN WHICH THESE VALUES WERE ADDED to the binary
search tree.
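The three binary search trees above (all storing 1, 2, 3, and 4) can be built
directly with nested TN constructor calls. A sketch, where the in_order helper
is my own, added just to check that every shape stores the same values in
sorted order:

```python
# Assuming the TN class from above
class TN:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left  = left
        self.right = right

t1 = TN(3, TN(1, None, TN(2)), TN(4))    # root 3
t2 = TN(2, TN(1), TN(3, None, TN(4)))    # root 2
t3 = TN(4, TN(1, None, TN(3, TN(2))))    # root 4

def in_order(t):  # an in-order traversal visits a BST's values in sorted order
    return [] if t is None else in_order(t.left) + [t.value] + in_order(t.right)

# Three different shapes, but the same sorted sequence of values
print(in_order(t1))   # [1, 2, 3, 4]
print(in_order(t2))   # [1, 2, 3, 4]
print(in_order(t3))   # [1, 2, 3, 4]
```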
------------------------------------------------------------------------------
Metrics

There is just one standard metric for linked lists: length. For trees there
are two standard metrics: size and height.

Size counts the number of nodes in a tree (so it is similar to length for
linked lists). It is easy to compute size recursively, using a function
similar to a recursive computation of the length of a list, but with two
recursive calls: one computing the size of each subtree, instead of one
computing the length of the rest of the list.

def size(atree):
    if atree == None:
        return 0
    else:
        return 1 + size(atree.left) + size(atree.right)

There is no simple way to compute size with a loop: for every node in the tree
we must visit both its left and right subtrees, so every time that we go to
the left subtree, we must also save/remember the right subtree for future
exploration too. We can write this function iteratively by using an extra list
of nodes, but the code is not simple to write nor easy to understand. I
suggest that you hand simulate it to understand how the nodes are all counted.

def size_i(atree):
    nodes = []
    size  = 0
    nodes.append(atree)
    while len(nodes) > 0:
        next = nodes.pop(0)
        if next != None:
            size += 1
            nodes.append(next.left)
            nodes.append(next.right)
    return size

The second metric for trees is height. In fact, we can apply height to any
node in the tree. The standard definition of the height of a node is a bit
strange: it is the number of steps needed to get from that node to the deepest
leaf in either of the node's subtrees. So the height of a leaf (the base case)
is 0, and the height of a tree is the height of its root node. We can directly
translate this definition into the following code. Again there are (at most)
two recursive calls, in the case of a node with two non-None children.
def height(atree):
    if atree.left == None and atree.right == None:  # leaf check as base case
        return 0
    elif atree.right == None:                       # only a left subtree
        return 1 + height(atree.left)               # recur only to left
    elif atree.left == None:                        # only a right subtree
        return 1 + height(atree.right)              # recur only to right
    else:                                           # both left/right subtrees
        return 1 + max(height(atree.left),height(atree.right))  # recur on both

This function deals with all the necessary cases: a leaf node, an internal
node with only a left (or only a right) subtree, and an internal node with
both left and right subtrees. This function does not work on empty trees,
which have no directly defined height, given the previous definition: the
height of a NODE... (there are no nodes in an empty tree!)

But this code is much more complicated than the code for computing size. The
complexity results from using a leaf node as the base case. Let us simplify
this code by using an empty tree as a base case, even though it makes no sense
for the standard definition of the height of a node: the number of steps
needed to get from the node to the deepest leaf in either of the node's
subtrees. In an empty tree, we have no node to start at and no leaf to reach.
With this new definition, we will "arbitrarily" define the height of an empty
tree to be -1. This might seem like a very strange approach, but it seems
reasonable too: an empty tree should have a height that is one less than a
leaf node (whose height is 0). By using this definition (and no others), we
can simplify the height function dramatically (as well as defining it for all
possible trees, even empty ones).

def height(atree):
    if atree == None:
        return -1
    else:
        return 1 + max( height(atree.left), height(atree.right) )

Mathematicians generalize definitions such as this one all the time. For any
value a, a**0 is defined as 1.
There are many ways to justify this definition (some quite complicated, using
limits and calculus); the simplest way is to note the algebraic law
a**x * a**y = a**(x+y). By this law (a quite useful one to have)
a**0 * a**x = a**(0+x) = a**x, which means that a**0 must be equal to 1 for
this identity to hold.

If we couldn't guess that -1 was the correct answer, we could deduce it, based
on what it would have to be for the recursive definition to be correct. If we
started by writing the correct recursive case

def height(atree):
    if atree == None:
        return empty-height  # actual value of empty-height to be determined
    else:
        return 1 + max( height(atree.left), height(atree.right) )

and looked at height called on a leaf node (which we know must compute a
height of 0), we would have

 0 = 1 + max( height(None), height(None) )  # both children of a leaf are None
 0 = 1 + max( empty-height, empty-height )  # height(None) returns empty-height
 0 = 1 + empty-height                       # max(x,x) = x for all x
-1 = empty-height                           # subtract 1 from each side

The second line comes from substituting what each base case returns; the third
comes from simplifying with max(x,x) = x (the maximum of a value and itself is
that value); the fourth line comes from subtracting 1 from each side of the
equality. So, we have deduced (from the recursive call) what the base case
(None) should return: -1.

------------------------------------------------------------------------------
Converting between a Binary Tree and the List Representation of a Binary Tree

Next we will look at functions that convert between trees and lists, showing
that there is a standard way to represent a tree as a nested list of values.
We represent every TN as a 3-list containing (in order) the value, left, and
right subtrees (each subtree is itself a 3-list). So, we represent the tree

      5
     / \
    3   8
       /
      6

by the list [5, [3, None, None], [8, [6, None, None], None]].
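Before looking at the conversion functions, here is a quick sketch showing
that plain list indexing already navigates this representation (index 0 is the
value, index 1 the left subtree, index 2 the right subtree):

```python
# The 3-list representation of the tree above: [value, left, right]
rep = [5, [3, None, None], [8, [6, None, None], None]]

print(rep[0])        # 5: the root's value
print(rep[1][0])     # 3: the value at the root of the left subtree
print(rep[2][1][0])  # 6: the left child of the right subtree's root
print(rep[1][1])     # None: 3's left subtree is empty
```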
Note that each list in this data structure always has exactly 3 values (empty
subtrees are represented by None). We could also put the value in the middle
of the 3-list, which would result in
[[None, 3, None], 5, [[None, 6, None], 8, None]] for the tree above. These
lists can be deeply nested for tall trees.

There are simple recursive functions to translate a tree argument into a list,
and a list argument into a tree. Again, each uses two recursive calls.

def list_to_tree(alist : list) -> TN:
    if alist == None:
        return None
    else:
        return TN( alist[0], list_to_tree(alist[1]), list_to_tree(alist[2]) )

Each recursive call on a non-empty list builds a TN with a value (alist[0]),
and then produces subtrees from the next two values in the 3-list; eventually
None will be reached in the base cases.

Likewise, we can just as easily translate from a tree (TN) to a list.

def tree_to_list(atree : TN) -> list:
    if atree == None:
        return None
    else:
        return [atree.value, tree_to_list(atree.left), tree_to_list(atree.right)]

Each recursive call on a non-empty tree builds a 3-list of the value, followed
by the list equivalents of the left and right subtrees; eventually None will
be reached in the base cases.

------------------------------------------------------------------------------
Printing a Binary Tree

The following function prints a tree rotated 90 degrees counter-clockwise. So
the binary tree we show as

           30
         /    \
       15      50
      /  \    /  \
    10    25 35   70
         /
       20

prints as follows. Notice where the root (30) appears, and where the roots of
its left (15) and right (50) subtrees appear, and the left/right roots of
those subtrees, etc.

    70
  50
    35
30
    25
      20
  15
    10

This function declares print_tree_1 as a local helper function that does all
the recursive work (using the indent_char/indent_delta parameters), and then
calls print_tree_1 with an initial indentation of 0 and the same atree.
The helper function either does nothing (for printing an empty tree), or
prints all values in its right subtree (first, with more indentation), then
its own value, and then all values in its left subtree (with more
indentation).

def print_tree(atree,indent_char=' ',indent_delta=2):
    def print_tree_1(indent,atree):
        if atree == None:
            return None  # print nothing
        else:
            print_tree_1(indent+indent_delta, atree.right)
            print(indent*indent_char+str(atree.value))
            print_tree_1(indent+indent_delta, atree.left)
    print_tree_1(0,atree)

At this point, we have dealt with the structure of trees, but not their
values. In a binary search tree, we can use its extra order property to search
for a value, add a value, and remove a value efficiently: think of a tree
representing a set of values (each value in a set is unique; that mirrors our
intent of having unique values in binary search trees).

------------------------------------------------------------------------------
Searching for a value in a Binary Search Tree

We can use the following iterative function to search for a value; unlike the
other functions written above, this one goes only one way (left or right) at
each tree node. We know that if the value we are searching for is less than a
node's value, by the order property of a binary search tree it must be in the
left subtree; if the value we are searching for is greater than a node's
value, it must be in the right subtree. So, except when the value is equal to
a node's value (in which case we have already found it), we go one way or the
other.

def search_i(atree,value):
    while atree != None and atree.value != value:  # Short-circuit evaluation
        if value < atree.value:
            atree = atree.left
        else:
            atree = atree.right
    return atree  # either None or the TN storing value

Note that the if statement is selecting which value (atree.left or
atree.right) to store in atree, so we can simplify this if statement using a
conditional expression.
def search_i(atree,value):
    while atree != None and atree.value != value:  # Short-circuit evaluation
        atree = (atree.left if value < atree.value else atree.right)
    return atree  # either None or the TN storing value

We can also write this function recursively.

def search_r(atree,value):
    if atree == None:
        return None
    else:
        if value == atree.value:
            return atree
        elif value < atree.value:
            return search_r(atree.left,value)
        else:  # value > atree.value, true by law of trichotomy: ==, <, or >
            return search_r(atree.right,value)

We can combine the base-case check and the equality check, and use a
conditional expression, to shorten this function to the following.

def search(atree,value):
    if atree == None or atree.value == value:  # Short-circuit evaluation
        return atree  # atree may be empty; if not, atree.value == value
    else:
        return search( (atree.left if value < atree.value else atree.right), value)

In the function above, the "base" case is an empty tree or the node storing
the value; the same recursive call is executed for subtrees, with the first
"smaller" tree (having fewer nodes) being either atree.left or atree.right.
Because this is a tail-recursive function, we expect to be able to write it
iteratively (as we did above).

------------------------------------------------------------------------------
Adding/Removing a value to/from a Binary Search Tree

Now, here is a similar function to add a value to a tree. We call it like
atree = add(atree,value), similarly to how we added a value to a linked list.
def add(atree,value):
    if atree == None:
        return TN(value)
    elif value == atree.value:
        return atree  # already in tree; do not change the tree
    else:
        if value < atree.value:
            atree.left  = add(atree.left,value)
        else:  # value > atree.value, true by law of trichotomy: ==, <, or >
            atree.right = add(atree.right,value)
        return atree

In all cases, this function returns a reference to a tree that contains a TN
storing value (the returned tree contains all the values in the original tree,
including the new node/value). It is similar to the recursive append function
for linked lists (which set alist.next = the recursive call). By the 3 proof
rules:

1) The code detects the base case (an empty tree) and returns a tree
   containing only a node storing value (all the nodes in the original tree
   -there are none- plus a node storing value).

2) Each recursive call (there are two) is on a left or right subtree, which is
   smaller than the entire tree: by at least one node, probably by many more
   if the other side contains some nodes.

3) Assume calling add returns a BST containing all the nodes in its smaller
   argument BST, plus a node containing value. When value is equal to
   atree.value, it returns just atree (which already contains value, without
   duplicating that value). When value is less/greater than atree.value, it
   calls add recursively, which returns the left/right BST with value
   included, and stores it back in atree.left/atree.right; finally it returns
   atree, which is a tree containing value (now in either the left or right
   subtree of atree).

Recall that the structure of a tree is not determined solely by the values it
contains. As we saw above, there are many legal binary search trees storing
the same values. The structure is determined by the order in which those
values are added to the tree. Adding values in increasing order, in decreasing
order, or at random will all produce different-shaped trees.
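We can see these different shapes by running add. A sketch, where add_all is
my own small helper for adding many values in sequence (the note's
binarysearchtree module has a similar one), using the simplified height
function from above:

```python
# Assuming the TN class, add, and height from above
class TN:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left  = left
        self.right = right

def add(atree, value):
    if atree is None:
        return TN(value)
    if value < atree.value:
        atree.left = add(atree.left, value)
    elif value > atree.value:
        atree.right = add(atree.right, value)
    return atree          # equal values are not added again

def height(atree):
    return -1 if atree is None else 1 + max(height(atree.left), height(atree.right))

def add_all(atree, values):   # my helper: add each value in order
    for v in values:
        atree = add(atree, v)
    return atree

print(height(add_all(None, [1, 2, 3, 4])))   # 3: increasing order gives a chain
print(height(add_all(None, [3, 1, 4, 2])))   # 2: a bushier shape, same values
```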
I will defer showing the remove function, but I will describe it here, and you
should use this description to practice deleting values from trees (shown
pictorially). Use the following simple tree for a first example

           30
         /    \
       15      50
      /  \    /  \
    10    25 35   70
         /
       20

Here are the rules:

1) To remove a value in a leaf, make its parent refer to None.

2) To remove a value in a node with one child, make its parent refer to its
   child (this works whether the node is a left/right child of its parent,
   and whether its child is a left/right child).

3) To remove a value in a node with 2 children:
   (a) Find the biggest node less than it (or the smallest node greater than
       it); that node must have either 0 or 1 children (can you explain why?)
   (b) Remove that node by rule 1 or 2.
   (c) Take its value and put it as the value of the node being removed.
   So the node being removed isn't really removed (another one is): but its
   value is replaced by another value, so the value is removed.

The first two rules are very simple. Here is an example of applying the third.
If we remove the value at the root, 30, we would (a) find the node 25,
(b) remove that node by making 15's right refer to 20, and (c) move the value
25 to the node that contains 30. Note the order property is preserved: all
values to the left of the node that used to store 30 are less than what it now
stores, 25 (25 was the biggest of the nodes < 30); all values to the right of
the node that used to store 30 are greater than what it now stores, 25
(25 is < 30, so nodes > 30 are > 25).

           25
         /    \
       15      50
      /  \    /  \
    10    20 35   70

The binarysearchtree module contains simple recursive functions for copying a
tree and determining whether two trees are equal (they not only store the same
values overall, but store these values in trees with the same shape). Examine
those functions, which appear below (or better yet, try to write them yourself
first).
def copy(atree):
    if atree == None:
        return None
    else:
        return TN(atree.value, copy(atree.left), copy(atree.right))

def equal(t1,t2):
    if t1 == None or t2 == None:
        return t1 == None and t2 == None
    else:
        return t1.value == t2.value and equal(t1.left,t2.left) and equal(t1.right,t2.right)

Note that because of the short-circuit "and" operator in equal, if the values
in any pair of nodes are not equal, the value False is returned immediately,
without making the recursive calls to equal.

Simple equality of BSTs means the same values in the same structure. We could
generalize equality to mean just the same values, regardless of structure,
which would lead to a more complicated and less efficient function. But if we
used BSTs to store sets, and wanted to perform set equality, we would have to
use the more general definition.

In that module, the generator_in_order generator yields all the values (from
lowest to highest) in the tree it is called on. In the next lecture we will
study traversal orders more generally, discussing pre-order, in-order,
post-order, and breadth-first order.

We can easily use binary search trees to represent a dictionary: each TN would
store a value that is a 2-tuple: a key-value pair. The keys in a dictionary
are known to be unique. When comparing these tuples, Python will always
compare/process the first value in the 2-tuple (the key) first. In a binary
search tree representation of a dictionary, all keys must be comparable; in a
Python dict, we can have keys that aren't comparable: one key could be an int
and another a str. So, Python dictionaries are NOT represented by binary
search trees, but by something even faster and more interesting: hash tables.
ICS-46 covers the runtime performance (efficiency) of lists, trees, and hash
tables (which are how Python stores both sets and dictionaries; hash tables
are covered briefly in a later ICS-33 lecture note).
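The generator_in_order generator mentioned above can be sketched as follows
(this is my own version, which may differ from the module's actual code):

```python
# Assuming the TN class from above
class TN:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left  = left
        self.right = right

def generator_in_order(atree):
    if atree is not None:
        yield from generator_in_order(atree.left)   # all smaller values first
        yield atree.value                           # then this node's value
        yield from generator_in_order(atree.right)  # then all larger values

t = TN(3, TN(1, None, TN(2)), TN(4))
print(list(generator_in_order(t)))   # [1, 2, 3, 4]: lowest to highest
```

Because of the BST order property, an in-order traversal always yields the
stored values in sorted order, whatever the tree's shape.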
A well-balanced binary search tree (all nodes having about an equal number of
nodes in their left and right subtrees) can be searched much faster than a
list or linked list. The amount of time it takes to search any binary search
tree is bounded by its height: the number of comparisons it needs to go
downward in the tree until it reaches the value it is searching for (or goes
beyond a leaf, meaning that the value is not in the tree). The height of an
N-node tree must be at least Log2 N (log base 2 of the number of nodes in the
tree). The typical height, when values are added randomly, is 2-3 times that.
In a linked list (or a pathological binary search tree: a very deep, skinny
one) the number of comparisons is N.

Log2 N is generally a much smaller number than N: Log2 1,000 is about 10;
Log2 1,000,000 is about 20; and Log2 10^9 (a billion) is about 30. So, we
could store a billion values in a reasonably well-balanced binary search tree
and determine whether a value is in it using only about 60-90 comparisons.

Try the following experiment, which prints the height of a tree with 1,000
values, added in a random order.

import random

values = [i for i in range(1000)]
random.shuffle(values)
print(height(add_all(None,values)))

Log2 1,000 is about 10, so the typical height of such a tree is about 20-30,
which means it takes 20-30 comparisons to find a value: much better than the
average of about 500 if the values are in an unordered list or linked list.
Also, see the random_height function in the binarysearchtree.py module.
Again, in ICS-46 we will look at tree processing in more depth :).

------------------------------------------------------------------------------
Expression Trees

We can also use binary trees to represent mathematical formulas/expressions.
In these trees, leaf nodes represent values (either literals or names bound to
values), and the internal nodes represent binary operators, unary operators,
or unary functions (whose operands will be in the right subtree).
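As a small sketch of this convention (my own tiny example, with all values
stored as strings): a unary function node keeps its single operand in the
right subtree, while a binary operator node uses both subtrees.

```python
# Assuming the TN class from above
class TN:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left  = left
        self.right = right

# The expression sqrt(x + y): a unary function over a binary operator
sqrt_of_sum = TN('sqrt', None, TN('+', TN('x'), TN('y')))

print(sqrt_of_sum.left)               # None: unary, so no left operand
print(sqrt_of_sum.right.value)        # + : the function's operand subtree
print(sqrt_of_sum.right.left.value)   # x
print(sqrt_of_sum.right.right.value)  # y
```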
For example, the expression (-b + sqrt(b**2 - 4*a*c))/(2*a) would be
represented by the following expression tree.

                     '/'
                   /     \
                 +        *
                / \      / \
               -   sqrt 2   a
                \    \
                 b    -
                    /   \
                  **     *
                 /  \   / \
                b    2 *   c
                      / \
                     4   a

Here I wrote '/' for the divide operator, since / means a left subtree in
other parts of the picture. All values are actually stored as strings.

Note that the structure of the tree determines how the subexpressions are
computed. There is no need for operator precedence rules or parentheses: the
structure of the tree embodies the ordering rules needed to correctly
evaluate an expression: operator nodes are computed after their operand
nodes.

There is an algorithm that people can follow to construct such a tree: find
the last operator or function call the computer would evaluate and put that
at the root of the tree; now do the same for its one/two subtrees that are
subexpressions, and keep repeating finding the roots of these until there are
no more operators or functions (names and literals stand for themselves). In
the expression above, the division between the numerator and denominator is
evaluated last; on the left side the addition is evaluated last; on the right
side there is only a multiplication, so that is done last. Continue this
process.

If we call print_tree on this tree, it would print as follows (but is hard to
"read").

....a
..*
....2
/
..........c
........*
............a
..........*
............4
......-
..........2
........**
..........b
....sqrt
..+
......b
....-

Once we have such a tree, we can perform many operations on it. The first and
most important is evaluating the tree.
We can do this recursively (evaluating subexpressions) by

(1) evaluating leaves (numeric values or names) as themselves
(2) evaluating unary operators on their evaluated operand, or unary functions
    on their evaluated argument
(3) evaluating binary operators on their evaluated operands

The code for this method follows this outline (it assumes that any names in
the tree, and any functions such as sqrt, are defined where eval is called).

def evaluate(etree):
    # name/literal as leaf node
    if etree.left == None and etree.right == None:
        return eval(str(etree.value))
    # unary operator/function call
    elif etree.left == None:
        if etree.value in {'+','-'}:  # unary operator
            return eval(etree.value + str(evaluate(etree.right)))
        else:                         # function call: assume legal name
            return eval(etree.value+'('+str(evaluate(etree.right))+')')
    # binary operator: assume etree.value in {'+','-','*','/','//','**'}
    else:
        return eval(str(evaluate(etree.left)) + etree.value + str(evaluate(etree.right)))

If we set a=1, b=2, c=1, the calculated value is -1.

We can translate this tree into infix (but overparenthesized) and postfix
form: in the postfix form, each operator is preceded by its two operands:
"a + 1" (infix form) translates to "a 1 +" (postfix form). Postfix notation
is also called reverse Polish notation (Polish notation, its prefix
counterpart, was invented by Polish logicians right before World War II).
Using it, we can write expressions unambiguously without any parentheses or
knowledge of operator precedence! "(a + b) * c" translates to "a b + c *" and
"a + b * c" translates to "a b c * +". Each binary operator applies to the
two values before it.

Here are the functions to perform these translations, and their results.

def infix(etree):
    if etree.left == None and etree.right == None:
        return '('+str(etree.value)+')'
    elif etree.left == None:
        return '('+etree.value+str(infix(etree.right))+')'
    else:
        return '('+str(infix(etree.left))+etree.value+str(infix(etree.right))+')'

which produces: (((-(b))+(sqrt(((b)**(2))-(((4)*(a))*(c)))))/((2)*(a)))

which is correct, but over-parenthesized.
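Here is a self-contained sketch of evaluate at work on a small tree for
1 + 2 * 3 (the TN class and evaluate function from above are repeated so the
sketch runs on its own):

```python
class TN:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left  = left
        self.right = right

def evaluate(etree):
    if etree.left is None and etree.right is None:  # name/literal as leaf node
        return eval(str(etree.value))
    elif etree.left is None:                        # unary op/function call
        if etree.value in {'+', '-'}:
            return eval(etree.value + str(evaluate(etree.right)))
        else:
            return eval(etree.value + '(' + str(evaluate(etree.right)) + ')')
    else:                                           # binary operator
        return eval(str(evaluate(etree.left)) + etree.value + str(evaluate(etree.right)))

# 1 + 2 * 3: the * is deeper in the tree, so it is evaluated before the +
etree = TN('+', TN('1'), TN('*', TN('2'), TN('3')))
print(evaluate(etree))   # 7
```

No precedence rules are consulted anywhere: putting * below + in the tree is
what forces the multiplication to happen first.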
def postfix(etree):
    if etree.left == None and etree.right == None:
        return str(etree.value)
    elif etree.left == None:
        return str(postfix(etree.right)) + ' ' + etree.value
    else:
        return str(postfix(etree.left)) + ' ' + str(postfix(etree.right)) + ' ' + etree.value

which produces: b - b 2 ** 4 a * c * - sqrt + 2 a * /

If you have never seen postfix notation, this is difficult to read, but if
you have studied this notation, it is easy. To understand which operators
apply to which operands, start on the left and circle each operand; when you
get to an operator, circle it and the number of operands it takes (which all
come before it). Look at smaller examples: 1+2*3 is 1 2 3 * + while (1+2)*3
is 1 2 + 3 *. The operands in postfix notation appear in the same order as in
regular notation, but the operators appear in different spots, based on
operator precedence and parentheses. It too requires no parentheses or
knowledge of operator precedence, so some argue that it is superior to the
notation we commonly use.

Finally, I have defined a parse_infix function that takes a string argument
and produces a tree representing the string. It is limited in the following
ways: all tokens must be separated by spaces; it assumes all operators are
binary, and that all operators are left-associative (which ** is not). So, it
does a bit of what Python does when it processes expressions written in
Python, but doesn't do everything correctly. But it does everything simply.

------------------------------------------------------------------------------
Problems

1) Draw all 14 binary search trees with the values 1, 2, 3, and 4.

2) Define a function named mirror, which takes one binary tree argument and
   returns a binary tree that is its mirror image: for any node, its left and
   right subtrees are switched (not just switched for the root, but switched
   for every node in the tree).

3) Define a function named sum, which takes one binary tree argument and
   returns the sum of all the node values.
4) Define a function named is_bst, which takes one binary tree argument and
   returns whether or not the tree is a binary search tree (satisfies the
   order property of binary search trees). It should return False for the
   following tree (which violates the order property):

     5
    /
   3
    \
     8

   Hint: I used two helper functions: all_less and all_greater.

5) Define a function named all_satisfy, which takes two arguments: a binary
   tree and a predicate; it returns whether or not the predicate is satisfied
   by (returns True for) all values in the binary tree.