Solutions to final exam from Fall 2003: ======================================= 1. (a) Each pair of items collides with probability 1/m, and there are n(n-1)/2 pairs. Therefore the expected number of collisions is n(n-1)/2m. (b) n(n-1)/2m < 1 whenever m > n(n-1)/2. (c) hashes = {} for x in keys: hashx = h(x) if hashx not in hashes: hashes[hashx] = [] for y in hashes[hashx]: print x,"collides with",y hashes[hashx].append(x) The time is O(n+k). 2. The median is a leaf whenever it is inserted after both its left and right neighbors. The six permutations of these neighbors are equally likely and only two of them have the property that the median is the last one inserted. So, the probability of being a leaf is 1/3. 3. No, because a rotation near the root of the tree could change the depths of most of the tree nodes. 4. Use a layered data structure, in which the outer layer consists of a segment tree for the intervals formed by projecting each rectangle onto the x axis. In each node of the segment tree, store an interval tree of the y-axis projections of the rectangles associated with that node. Each rectangle is associated with O(log n) segment tree nodes, and contributes a constant amount to the space for each of a corresponding set of O(log n) interval trees, so the total space is O(log n). A query consists of identifying a set of O(log n) segment tree nodes containing the x coordinate of the query, and then looking up the y coordinate of the query in each of the O(log n) interval trees corresponding to these nodes. The query time is dominated by the interval tree lookups, which total O(log^2 n + k) time. 5. Suppose a find operation traverses more than one edge of its tree. Then each additional edge it traverses causes another node to be relinked to the root of its tree. Each node is relinked in this way only once, so the total number of additional edges traversed is O(m) and the total additional time spent in multi-step finds is O(m). All other operations take constant time each so the total time is O(m). This argument doesn't depend on whether union by size is used, it only needs path compression. 6. The child(x) operation replaces a single value x in the translated range minimum instance by a triple of numbers x x+1 x. Solutions to too-difficult Phase II exam from Spring 2004: ========================================================== (a) Use a planar convex hull, and binary search for the point that minimizes the linear function y-ax-b. Space = O(n), preprocessing = O(n log n), query time = O(log n). (b) Make a binary search tree (for instance, a perfectly balanced binary tree, since we don't have to worry about insertions or deletions), such that the inorder traversal of the nodes is the sorted order of the points by their x coordinates, and augment each node with the data structure from part (a) for its descendants. To find the minimum x coordinate among points with y<=ax+b, search down the binary search tree, at each step using the augmented data structure to test whether there is a point with y<=ax+b among the descendants of the left child. If so, go left, else if the root has y<=ax+b, return it, else go right. The space is O(R(n) log n), the preprocessing time is O(P(n) log n) (can be sped up to O(n log n) for convex hulls with some care, but the answer I expected was P(n) log n), and the query time is O(Q(n) log n) (can be sped up to O(log n) for convex hulls by using fractional cascading, but the answer I was expecting was O(Q(n) log n)). Solutions to homework 4: ======================== 1. S is a cyclic rotation of T if S appears as a substring of TT. So build the suffix tree of TT, then traverse the path for S. 2. There are multiple possible solutions to this, but here's one: Use a flat tree or fusion tree to translate L and R into indices into the sequence of points (xi,yi) (sorted by x coordinate), and use these indices to perform a range min query on the sequence of values yi. Space and query time are the same as that of your flat tree or fusion tree data structure, plus O(n) space and O(1) query time for the range min part. 3. Build a tree, the root of which is the highest pass. Removing this pass from the map splits the minimum spanning tree into two parts, and recursively construct trees for the left and right parts. At the leaves of the tree will be the valleys between the passes. The highest pass between any two valleys is the LCA of the two corresponding tree nodes.