CompSci 261, Spring 2017, Homework 1

  1. Suppose that you are writing code to process html files using the Python html.parser module. This module will read lines from a file, and call functions handle_starttag and handle_endtag (that you define yourself) when it encounters html tags of different types. Both of these functions take a string argument, the type of tag. However, it will not check that each end tag equals the corresponding start tag, as it should in correct html. Describe how your handle_starttag and handle_endtag functions could use a stack data structure to perform this check. Your description should be in English, not code or pseudocode.

    For instance, your method should accept as correct the sequence of calls starttag("li"); starttag("p"); endtag("p"); endtag("li"). However, if the two endtags occur in the opposite order, or if any one of these four calls is omitted, your method should determine that there is a problem.

  2. We saw how to use a dynamic array to implement a stack, so that each push or pop operation calls a constant number of dynamic array operations. Similarly, describe how to use dynamic arrays to implement a queue so that each enqueue or dequeue operation calls a constant number of dynamic array operations. Your solution should not depend on the details of its dynamic array; in particular it should not notice when an array is resized and do anything different in that case. It should only access an array by reading and writing array elements, using operations that grow or shrink the array by one cell, or creating a new empty array.

    (Hint: maintain two dynamic arrays, one for enqueuing and one for dequeueing, and switch them from one role to the other when necessary. Note that your solution should use a constant number of dynamic array operations in the worst case, not just on average. You may assume that deleting an array (even a large one) is a single operation. For full credit, the arrays maintained by your solution should have total length $O(n)$, where $n$ is the number of elements currently in the queue.)

  3. When we analyzed dynamic arrays that expand the array by a constant factor $c>1$ whenever the array becomes full (its size equals its length), we showed using geometric sequences that the average time per grow operation is constant. This is true for $c=2$, $c=3/2$, or $c=9/8$, the constants for the textbook version of the algorithm, the Java version, and the Python version. But what if we want to determine more accurately how the time depends on $c$ rather than just saying that it's a constant? Do the same analysis more carefully to give the average time per grow operation, in $O$-notation, as a function of $c$.

    (Hint: consider the same geometric series $1+c+c^2+\dots+n+cn$, and divide its sum by the number $n$ of operations.)

  4. The Wikipedia article on the potential function method includes a demonstration that, for a binary number undergoing operations that add one to the number, the amortized number of bits that change per operation is $O(1)$. It uses a potential function $\Phi$ that counts the number of nonzero bits of the number, and observes that each operation that changes a large number of bits also decreases $\Phi$ by a correspondingly large amount, so that using amortized analysis the large decrease in $\Phi$ cancels the large number of bit-changes. Now suppose that we are instead maintaining a decimal number, under the same operations, and that we want to show that the number of digits that change per operation is $O(1)$. Provide a potential function that works for this problem, and explain why it works.