1. Let the three inputs be x, y, and z. We need to show that they lead to all possible triples of outputs, with equal probability for each triple. Recall that our hash function is computed by splitting each input into bytes x_0 x_1 x_2 x_3, y_0 y_1 y_2 y_3, and z_0 z_1 z_2 z_3, and then xoring together T[0,x_0]^T[1,x_1]^T[2,x_2]^T[3,x_3], etc., where table T is filled with independent random values.

First, suppose that some byte position i has three different values x_i, y_i, and z_i. Let A be a fixed assignment of the table cells other than the three cells T[i,x_i], T[i,y_i], and T[i,z_i]. Then with this assignment, the hash value for x will be h(x)=X_A^T[i,x_i], the hash value for y will be h(y)=Y_A^T[i,y_i], and the hash value for z will be h(z)=Z_A^T[i,z_i], for some X_A, Y_A, and Z_A that depend only on A. But T[i,x_i], T[i,y_i], and T[i,z_i] are independent, so the triple (T[i,x_i],T[i,y_i],T[i,z_i]) is equally likely to be any triple of values. Xoring the three positions of this triple with the fixed values X_A, Y_A, and Z_A just permutes the possible triples without changing their likelihoods, so the triple (h(x),h(y),h(z)) is also equally likely to be any triple.

That doesn't complete the analysis of this case, because we've only shown that all triples (h(x),h(y),h(z)) are equally likely for a particular fixed assignment A to the rest of the table. But if we consider all possible assignments A_1, A_2, A_3, ... (there are very many of them!), each one leads in the same way to equal probabilities for all output triples. The probability of each output triple in the actual problem (where A is not fixed) is just the average of that triple's probabilities over the different choices A_1, A_2, A_3, etc. But when we average sets of numbers that are all equal to each other, the averages stay equal. So in the original problem, each output triple has the same probability as every other output triple.

Now the messier case, when there isn't a single position where all three values are different. At each position where the three values don't all agree, two of them must be equal and the third different, and if the same input were the odd one out at every such position then the other two inputs would be equal everywhere. So in order for x, y, and z to all be different, there must be two different positions at which different inputs are the odd ones out. There are three different ways to choose which two agree and which one differs, but by permuting x, y, and z if necessary we can assume there is one position i for which x_i = y_i ≠ z_i and a second position j for which x_j ≠ y_j = z_j. Again, we can let A be a fixed assignment (that we will later average over) to all the cells of T other than T[i,z_i], T[j,x_j], and T[j,z_j]. Then h(x)=X_A^T[j,x_j], h(y)=Y_A^T[j,z_j], and h(z)=Z_A^T[i,z_i]^T[j,z_j] for some triple of values X_A, Y_A, and Z_A that are fixed by the choice of A. But the unfixed triple (T[i,z_i],T[j,x_j],T[j,z_j]) is equally likely to be anything, and mapping a triple (p,q,r) to the triple (X_A^q, Y_A^r, Z_A^p^r) is an invertible transformation, so it permutes the possible triples without changing their probabilities. Therefore, the triple (h(x),h(y),h(z)) is again equally likely to be any possible value. As in the first case, since this is true for each fixed choice of the assignment A to the rest of the table, a simple averaging argument shows that it's also true when A is allowed to vary randomly.
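As a concrete sanity check of this argument, here is a small sketch of tabulation hashing, scaled down so that the claim can be verified exhaustively: keys have two one-bit "bytes" and hash values are a single bit, so we can enumerate every possible table filling. (These tiny parameters and the three sample keys are my own choices for the demo; the scheme in the problem uses four bytes per key.) With one-bit bytes, no position can hold three distinct values, so the chosen keys exercise the messier second case.

    # Exhaustive check of 3-wise independence for a tiny tabulation hash:
    # 2 byte positions, 1-bit bytes, 1-bit hash values. The parameters
    # and sample keys are illustrative choices, not from the problem.
    from itertools import product
    from collections import Counter

    def tab_hash(table, key):
        # xor together one table cell per byte position of the key
        h = 0
        for i, byte in enumerate(key):
            h ^= table[i][byte]
        return h

    x, y, z = (0, 0), (0, 1), (1, 1)   # three distinct keys
    counts = Counter()
    # enumerate all 16 ways to fill the four one-bit table cells
    for cells in product(range(2), repeat=4):
        table = [cells[0:2], cells[2:4]]
        triple = (tab_hash(table, x), tab_hash(table, y), tab_hash(table, z))
        counts[triple] += 1

    print(counts)   # each of the 8 output triples should appear exactly twice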
2. (a) The short answer is 1/4. Here's a longer explanation of why this answer works. If only 1/4 of the cells are non-empty, then a single probe of the table hits a non-empty cell, and so looks like a false positive, with probability 1/4. Five different probes would then give a false positive with probability (1/4)^5 = 1/1024 if the probe locations were independent of each other. They're not quite independent (the problem says they should be chosen to be different from each other), but their dependence only improves the false positive rate. To see this, consider what can happen at the first probe: if it sees an empty cell, then we already know the query is not a false positive, and if it sees a non-empty cell, then since we will not probe that cell again, the fraction of non-empty cells among the remaining candidates is a little smaller than 1/4. (A quick simulation sketch appears after problem 3.)

(b) We have 1024 items and we want the table to be 1/4 full, so we need 4 × 1024 = 4096 bits of memory. Each word can hold 32 bits, so we need 4096 / 32 = 128 words of memory.

3. (a) 1/2. x is a leaf exactly when it is inserted after its neighbor in the sorted order. Each of these two values is equally likely to be inserted last.

(b) 1/3. x is a leaf exactly when it is the last to be inserted among the three consecutive values having x in the middle (x and its two neighbors). Each of these three values is equally likely to be last.

(c) The two extreme values are each leaves with probability 1/2, and the other n − 2 values are each leaves with probability 1/3, so by linearity of expectation the expected number of leaves is 2 × (1/2) + (n − 2) × (1/3) = (n + 1)/3.
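To sanity-check problem 3 empirically, here is a small simulation sketch: it inserts a random permutation of n distinct keys into an unbalanced binary search tree and counts the leaves, whose average should come out near (n + 1)/3. (The key set, parameter values, and helper names are my own illustrative choices.)

    # Simulation sketch for problem 3: average number of leaves in a
    # binary search tree built from a random insertion order.
    # Parameters and helper names are illustrative, not from the problem.
    import random

    def count_leaves(order):
        # left[v] and right[v] hold the child keys of node v, if any
        left, right = {}, {}
        root = order[0]
        for key in order[1:]:
            v = root
            while True:
                child = left if key < v else right
                if v in child:
                    v = child[v]         # keep descending
                else:
                    child[v] = key       # attach the new key here
                    break
        # a leaf is a node with neither a left nor a right child
        return sum(1 for v in order if v not in left and v not in right)

    n, trials = 20, 10000
    total = sum(count_leaves(random.sample(range(n), n)) for _ in range(trials))
    print(total / trials, (n + 1) / 3)   # the two numbers should be close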
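Returning to problem 2(a), here is a matching simulation sketch: fill a quarter of a table's cells, make five probes at distinct random locations, and record how often all five land on non-empty cells. The observed rate should sit at or slightly below (1/4)^5 = 1/1024, illustrating that the distinct-probe dependence only helps. (The table size and trial count are arbitrary demo parameters.)

    # Simulation sketch for problem 2(a): false positive rate of five
    # distinct probes into a table that is exactly 1/4 full.
    # Table size and trial count are arbitrary demo parameters.
    import random

    size = 4096
    occupied = set(random.sample(range(size), size // 4))   # 1/4 of cells full

    trials = 1_000_000
    false_positives = 0
    for _ in range(trials):
        probes = random.sample(range(size), 5)   # five distinct locations
        if all(p in occupied for p in probes):
            false_positives += 1

    print(false_positives / trials, 1 / 1024)
    # the estimate should be at or slightly below 1/1024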