Introduction |
This programming assignment is designed to ensure that you know how to
implement two templated classes (Priority Queue and Map)
with binary trees (max-heap and Binary Search Tree respectively).
Your implementations will also include fully-functional iterators for
these classes.
For the priority queue you will be writing code that processes the underlying
array storing the values as a max-heap.
For the map you will be writing iterative and recursive code that processes the
BST.
You can test these implementations by using the standard drivers (provided with the download) and GoogleTests that we will use when grading your code for correctness; recall that you can augment the GoogleTest with whatever code you want, to aid your debugging: a GoogleTest is just a C++ program. You can also test the code you wrote for Programming Assignment #1 (which used array implementations of these classes) by substituting these binary tree implementations -typically by changing a few typedef statements. In fact, you will be required to translate my solution to the wordgenerator program to use HeapPriorityQueue and BSTMap and indicate the performance improvement (over the original priority queue and map classes) on a large text file (wghuck.txt). Write and use the standard insertion (<<) operator and str() method in each class for debugging. Note that there are few tested requirements for what these operators/methods return, but this code will make debugging easier, and we may examine it by hand. For str in HeapPriorityQueue return in a std::string the entire contents array for all indexes (ascending) along with the other instance variables: from this information you can easily reconstruct the max-heap-as-binary_tree. After enqueueing b, a, c, the HeapPriorityQueue's str() prints as heap_priority_queue[0:a,1:b,2:c,3:](length=4,used=3,mod_count=4).
For str in BSTMap return a std::string that shows the BST
rotated left by 90 degrees along with the other instance variables, so you
can easily examine the structure of the BST.
After assocating b with 2, a with 1, and
c with 3, the BSTMap's str() prints as
You should download the program3 project folder and use it to create an CLion project. You will write the required methods in the heap_priority_queue.hpp and bst_map.hpp files in this project, and submit each separately in Checkmate. The project folder also contains two pairs of .hpp and .cpp files: a driver/GoogleTest pair for each class that you will write, and the driver.cpp file which has a main function that can be made to run any of the two drivers. Important: Only one of the .cpp files with a main method can be active/compiling at any time. In the download, only the driver.cpp file is active; the GoogleTests are inactive. To make a progam inactive, select it (in the editor tab), use the Ctrl-a command to select all its lines, and then type Ctrl-/ (or click Source at the top left of the menu and choose Toggle Comment): ever line will now appear in a comment (so the main function is commented-out); by using these same instructions, you can toggle back those lines to not have comments. I recommend that you work on this assignment in pairs. Try to find someone who lives near you, with similar programming skills, and work habits/schedule: e.g., talk about whether you prefer to work mornings, nights, or weekends; what kind of commitment you will make to submit program early. Only ONE STUDENT should submit the assignment (all parts of it). If students work in pairs, BOTH NAMES and their UCInetID names must appear in a comment at the top of each submitted program. For example if Romeo Montague (whose UCInetID is romeo1) submitted a program that he worked on with his partner Juliet Capulet (whose UCInetID is jcapulet) the comment at the top of each submitted file would appear as: // Submitter: romeo1(Montague, Romeo) // Partner : jcapulet(Capulet, Juliet) // We certify that we worked cooperatively on this programming // assignment, according to the rules for pair programmingIf you do not know what the terms cooperatively and/or rules for pair programming mean, please read about Pair Programming before starting this assignment. If the names do not appear at the top of all your submissions in exactly this form, points will be deducted. If you are submitting by yourself, you may omit all lines but the first (Submitter). Please do turn in each program as you finish it, so that I can accurately assess the progress of the class as a whole during this assignment; do not turn in all the programs at the same time. Print this document and carefully read it, marking any parts that contain important detailed information that you find (for review before you turn in the files).
This assignment has 3 parts: pairs should work on each part together, not split them up and do them separately. Part 1 is worth 30 points; part 2 is worth 25 points; part 3 is worth 5 pts. Remember I'm going to be running MOSS on the parts of this assignment. Important: The courselib contains array implementations for all these data types (and you have written linked-list versions of them); although this assignment requires you to use binary trees, there are still many strong similarities at a high level in all these implementations. So, I encourage you to examine these implementations closely, and understand them; possibly, experiment with them (using their drivers or GoogleTests), while you are writing your binary tree implementations: this advice is especially true as you begin to study, understand, and implement iterators. Please feel free about asking questions about these methods: both their syntax and semantics. |
Priority Queues (30 pts) |
Priority Queues can be implemented by a variety of data structures (where the
highest priority value is always removed first).
How does a specific priority queue determine which value has the highest
priority?
We supply the priority queue with a gt (greater-than) function that
computes whether or not its first argument has a greater priority than its
second argument.
This gt can be passed to any constructor or instantiated in the
template.
So, we cannot ask, "What is the priority of a value." But, we can ask
"Does the priority of a first value have a higher priority than a second
value", by calling the gt function.
For example, we cannot determine the priority of a std::string value, but
we can determine whether one std::string value has a higher
priority than another std::string value.
We can implement priority queues efficiently using a max-heap (stored in an array), whose enqueue and dequeue operations are each O(Log N). Enqueuing/Dequeuing N values is O(N Log N). The file heap_priority_queue.hpp declares the appropriate constructors, methods, operators, and instance variables. Note the simple helper methods left_child, right_child, parent, is_root, and in_heap, which abstract the mapping functions and two useful bool functions. Use these methods to simplify writing the helper methods percolate_up and percolate_down (int i), which do most of the work in the enqueue and dequeue methods (respectively) and may be called in other places as well (e.g., the iterator's erase method). Think carefully about how to write these percolate methods simply (or you will never get them properly debugged): my method bodies used for loops and were 2 lines and 9 lines long each (each using std::swap included from <utility>). I suggest copying/pasting the methods from the array_priority_queue.hpp file, and then translating these methods from using an array to using a max-heap, since both fundamentally process arrays (e.g., see/call the ensure_length helper method). Pay close attention to ensure all instance variables receive values in the constructors and are used/set correctly in queries and commands. Special Requirements:
Iterators: There is no simple/efficient way to write an iterator for this max-heap implementation of priority queues, because the iterator is supposed to produce its values from highest to lowest priority, but the array is a max-heap (so we cannot just traverse its values); being able to erase values with an iterator makes this task even harder. Here is how to implement the iterator for this class.
Feel free to question/discuss iterators for HeapPriorityQueue on the message boards, so long as no code is posted or described in too much detail. |
Maps (25 pts) |
Maps can be implemented by a variety of data structures: they all associate
unique keys with values (possibly duplicated), by storing an
ics::pair holding both the key and its value.
The data structure should allow us quickly to find the value associated with
a key, and possibly change/update such a value.
We must supply BSTMap with a lt (less-than) function that
computes whether its first argument key belongs in the left subtree
(if not, and the two values are !=, it belongs in the right subtree).
This lt can be passed to any constructor or instantiated in the
template.
We can implement maps efficiently by using a binary search tree (BST) whose order property is determined by comparing only the key (first) part of the pair. Most operations updating BSTs are O(height of BST) which for non-pathological BSTs is O(Log N). The file bst_map.hpp declares the appropriate constructors, methods, operators, and instance variables. Note the declaration of many helper methods that operate on BSTs: most (in fact, all but find_key) are more simply implemented recursively. These recursive methods typically are passed the root of a BST (using some parameter mode), and then call themselves recursively on a left and/or right subtree (passing their left/right child as the root of a smaller tree). Many of the standard map methods call one or more of these helper methods to retrieve information from a BST or change it (e.g., add/remove key->value assocations). These include the functions find_key, has_value, copy, copy_to_queue (a helper for creating iterators), equals (a helper for operator == ), and string_rotated (a helper for str(); and the mutators insert (a helpder for put), find_addempty (a helper for []), remove_closest (a helper for remove), remove (a helper for erase), and delete_BST. I suggest examining code in the array_map.hpp file to help you understand some of the bookkeeping requirements of these methods. Many of the recursive helper functions you need to write have similar functions in the notes on processing binary trees (including BSTs) recursively. Pay close attention to ensure all instance variables receive values in the constructors and are used/set correctly in queries and commands. Special Requirements:
Iterators: Note that map iterators produce Entry which is defined by typedef ics::pair<KEY,T> Entry; There are moderately simple/efficient ways to write iterators for a BST implementation of maps, but only if we don't allow iterators to erase any key/value. Because we are allowing erasure, we will adopt a simple but time/space-inefficient iterator for this class: it has some similarities implementing iterators for the heap priority queue.
Feel free to question/discuss iterators for BSTMap on the message boards, so long as no code is posted or described in too much detail. |
wordgenerator (5 pts) |
The download contains the file wordgenerator.cpp: my solution to the
last part of Programming Assignment #1 (which also times itself running for
building the corpus and creating a priority queue of its values in order).
It also contains wghuck.txt: the text, appropriately formated, of
Mark Twain's book, "The Adventures of Huckleberry Finn".
For this part of the assignment, you must alter the #includes and
any typedefs and code in this program, so that it uses the improved
data type implementations (HeapPriorityQueue and BSTMap)
instead of the "high-complexity" (inefficient) simple array versions of these
data types.
In the comment at the bottom of this program, fill in the running times produced by the program (see the output line starting read_corpus time = and print_corpus (sort not print) time =) using both the array version and this improved version. Use an order statistic of 2. |
Testing |
There are various ways to test each of the classes you are writing in this
programming assignment.
First, though, you should write all the methods, paying careful attention to
the array implementations and previously written linked list implementations.
For some, you might just boiler-plate simple code that is not correct, but
allows the methods to compile, allowing other methods in the classes to be
tested.
For example (from the HeapPriorityQueue class)
template<class T, bool (*tgt)(const T& a, const T& b)> bool HeapPriorityQueue<T,tgt>::empty() const {return false; //On this line to indicate boiler-plate return used == 0; } template<class T, bool (*tgt)(const T& a, const T& b)> T& HeapPriorityQueue<T,tgt>::peek () const {return pq[0]; //On this line to indicate boiler-plate }Also, a constructor or void method will compile correctly with no statements in its body. Finally, don't forget to write actual code for the code you've boiler-plated. The easiest way to start testing//debugging is by using the driver program. It allows you to perform any method call supported by the templated classes, and see the state of the class (or view the debugger). Start with with the default constructor and the insertion (<<) operator and str() method, so you can use them to debug the other methods; then test whatever command puts data into the data structure; then the queries; then the remaining commands; then the operators; then the iterators; then finally the remaining constructors. After you test and debug your code with the driver, try running the appropriate GoogleTest code. Again, this form of testing is useful only as you approach a finished solution. We will use the GoogleTest, and visual inspection, to grade this assignment. Important Note: While you are debugging your classes, you can edit the GoogleTest (for example, putting std::cout statements at strategic points), but don't accidentally remove any of the assertions, otherwise you won't be fully checking your code the way we will. All debugging std::cout should end in std::endl to flush the output stream: ensure the output it displayed before executing the next statement (which may throw an exception, and thus the output buffer may not be flushed). When you run the GoogleTest, initially choose small values for the first and third prompts (just press return to the second prompt) or comment-out these prompts and assign small values to these variables instead. Besides an indication of which tests pass and fail, the console window will show a speed for the speed test (which will vary depending on how fast a machine you run your code on): don't worry about it. When your code is passing all the tests, put in values like 10,000 for these prompts.
|