Introduction |
This programming assignment is designed to ensure that you understand the Graph
data type, know how to implement it with (Maps and Sets)
and practice using it to implement an extended version of Dijkstra's
shortest-path algorithm.
You are already familiar with templeted classes implementing the Map and Set data types; in this assignment you will implement HashGraph using your implementations of HashOpenMap and HashOpenSet (or my solutions from Programming Assignment #4). Included for the first time are methods that store a graph's information into a text file, and load (build) a graph from a text file in this format. This assignment, unlike the previous three assignments requires implementing no new complicated data structure and no new iterator: like the Equivalence class on Quiz #7, you will write it using fast data structure implementations of the same standard data types that we have been using since Programming Assignment #1. You can test the HashGraph implementation by using a driver for Graphs (provided with the download) and GoogleTest that we will use when grading your code for correctness; recall that you can augment the GoogleTest with whatever code you want, to aid your debugging: a GoogleTest is just a C++ program. You can also test the HashGraph implementation when writing/debugging the extended Dijkstras algorithm. Look at the specifications below, but also look at the comments in the HashGraph class provided, which you will fill in with code. It includes a LocalInfo class that I have completely implemented, including a << method for displaying LocalInfo objects. It is an excellent idea to study this class to better understand it, and how to use it when implementing your HashGraph class. You will use it to write a << method for graphs; an example of its output is shown below (for the graph specified in the standard.txt file): note that the node maps in a graph are printed in alphabetical order, but the sets on each line have no ordering. graph g = graph[ a -> LocalInfo[ out_nodes = set[c,b,d] out_edges = set[->c(13),->b(12),->d(14)] in_nodes = set[d] in_edges = set[d(41)->]] b -> LocalInfo[ out_nodes = set[d] out_edges = set[->d(24)] in_nodes = set[a] in_edges = set[a(12)->]] c -> LocalInfo[ out_nodes = set[d] out_edges = set[->d(34)] in_nodes = set[a] in_edges = set[a(13)->]] d -> LocalInfo[ out_nodes = set[a] out_edges = set[->a(41)] in_nodes = set[b,c,a] in_edges = set[b(24)->,a(14)->,c(34)->]] e -> LocalInfo[ out_nodes = set[] out_edges = set[] in_nodes = set[] in_edges = set[]] ] You should download the program5 project folder and use it to create an Clion project (ultimately needing to connect it to both the courselib and googletest libraries). You will write the required methods in the hash_graph.hpp, heap_adujstable_hash_priority_queue.hpp, dijkstra.hpp, and dijkstra.cpp files in this project, and submit each separately in Checkmate. The project folder will also eventually contain my solution files hash_map.hpp and hash_set.hpp, after Programming Assignment #4 is due; it also contains my solution to the heap_priority_queue.hpp, but that information is duplicated and extended in the heap_adujstable_hash_priority_queue.hpp file. The project folder also contains the driver.cpp (and driver_graph.cpp and driver_heap_adjustable_hash_priority_queue.cpp) and the GoogleTest files test_graph.cpp and test_heap_adjustable_hash_priority_queue.cpp. Finally, the file standard.txt is the textual representation of a graph used in the GoogleTest (and which you can use with the driver as well); the files flightdist.txt, and flightcost.txt will be used in the Dijkstra part of the assignment. All are stored in the input files folder. The HashGraph part of the assignment is worth 30 points; the HeapAdjustableHashPriorityQueue part is worth 5 points (but must be written to write Dijkstra correctly) and Dijkstra part is worth 25 points. Important: Only one of the .cpp files with a main method can be active/compiling at any time. In the download, only the driver.cpp and its driver_graph.cpp file is active; the drivers for the priority queue and GoogleTests are inactive. To make a progam inactive, select it (in the editor tab), use the Ctrl-a command to select all its lines, and then type Ctrl-/ (or click Source at the top left of the menu and choose Toggle Comment): ever line will now appear in a comment (so the main function is commented-out); by using these same instructions, you can toggle back those lines to not have comments. This final assignment is to be done individually. Each student should submit all parts of the the assignment. Remember I'm going to be running MOSS on the parts of this assignment. Print this document and carefully read it, marking any parts that contain important detailed information that you find (for review before you turn in the files). |
The HashGraph Class |
The HashGraph template class (templated by class T) acts as a
repository for all the information (nodes and edges) that comprise a graph.
It is organized in such a way (with redundancy) that allows us quickly to
access information a variety of information that is useful in graph
algorithms.
Part of its speed (and complexity) comes from storing (and updating) the
redundant information.
We will represent each graph primarily using two maps: one with nodes as keys and one with edges as keys (each edge is a pair of std::string node names) whose associated value's type is specified in the template. With this representation, it is easy to find the value associated with an edge, find all the nodes and edges in the graph, as well as find all the nodes and edges directly connected (into and out of) some specific node
We partition the connected nodes into two sets: outgoing and incoming, by using the out_nodes and in_nodes instance variables in the LocalInfo class. Likewise, we partition the edges into two sets: outgoing and incoming, by using the out_edges and in_edges instance variables in the LocalInfo class. All these sets could be computed when necessary from the main maps, but it is more efficient to cache their values and then be able to use these values directly, without (re)computing them. The downside of caching is that it takes more space and whenever we udpate the main maps, we must also update the related LocalInfo sets too, which is where some of the complexity of this programming assignment comes from. A variety of methods return const references to the two main maps and the sets in LocalInfo. We can iterate over this information, but we should never directly change these maps and sets (even through an iterator) returned by these methods. We should mutate the Graph only through the methods it supplies, which ensure that the necessary redundant information is kept correctly consistent. Of course, we can use the iterator constructors to build other data types with this information. Here is a brief list of all the methods in HashGraph . All of these methods are documented more fully in the template class in the download.
Queries Operators |
An Adjustable PriorityQueue Class (modifying the HeapPriorityQueue class) |
The standard efficient implementation of a priority queue (pq) data type uses a
heap data structure.
It stores each pq value in an array (efficiently storing the heap binary tree)
that conforms to the the max-heap ordering property.
We implemented a priority queue in this way in Programming Assignment #3.
Once a value is enqueued into the pq, we should not mutate it, because doing so
would likely invalidate the heap ordering property: the mutated value will
not likely be stored at the correct index to keep the heap ordering property
satisified.
In an adjustable priority queue data type, such updating is allowed via the update method, which specifies as arguments an old pq value (one already in the pq) and what new pq value to replace it by (so there still is no mutation, just replacement of one value by another in the array). The update operation must
In the standard heap implementation, there is no efficient way to do step 1: finding the index of the array storing the old pq value. The heap ordering property does little good when searching for a value: if the value isn't at the root, we must still search both subtrees (although, if we are searching a max-heap, we could avoid searching below any internal node that compares "less than" the value we are searching for, because all values below it would be even smaller). In fact, in the standad heap priority queue implementation, when we had to find the index of a pq value (for erasing it, using an iterator), we did so by just linearly scanning the array for that value. Thus, in this implementation, even step 1 above would be of complexity O(N). An improved implementation of an adjustable priority queue, would still store a heap in an array (for quickly removing the maximum priority value, with complexity O(Log N)). In addition, it would store an auxillary map whose keys are pq values, and associated with each key/pq value would be the index in the heap array at which the value is stored. (so, we now have a requirements that the pq values must be unique; using this map we can efficiently, with complexity O(1), check whether or not an enqueued value is unique.) Thus, we would do step 1 by using the map to locate the index of the old value in the array storing the heap. Using a map implemented by a hash table, this operation is of complexity O(1). We can then efficiently perform steps 2 and 3: storing the new pq value in the array at the same index, with complexity O(1); percolating the new value upward or downward to restore the heap ordering property in the array, with complexity O(Log N). But, as we percolate values in the array, we must also remember to update the map that we use to remember the index of every value stored in the array, because the indexes of the values are changing. Each swap (there are at most O(Log N) swaps), must update the map using an operation with complexity O(1). So the resulting complexity for the entire update operation is O(1)+O(1)+O(1)*O(Log N) = O(Log N). The download includes the file heap_adjustable_hash_priority_queue.hpp which updates my original solution, heap_priority_queue.hpp. It already includes...
Update the methods in the heap_adjustable_hash_priority_queue.hpp file as follows (all updates must preserve the invariant). Note that each method to update includes // update: as a comment.
|
Extended Dijkstra's Algorithm Implementing a Graph Algorithm with the HashGraph class |
In this part of the programming assignment, you will implement an extended
version of Dijkstra's Algorithm, using the HashGraph class -as well
as various other templated classes.
This algorithm reads a graph whose edges store integers (representing
non-negative costs), and then prompts the user for a start node.
It then computes a map that contains the minimum cost (shortest) paths to all
reachable nodes from that start node, and prints this map.
Finally, the user is prompted for any number of stop nodes, and prints the
minimum cost to reach that node and the list the nodes on this path.
The project folder contains the files dijkstra.hpp and dijkstra.cpp. The dijkstra.hpp provides an Info class and requires you to write two methods: one to compute shortest paths; one to reconstruct the nodes on a shortest path. The dijkstra.cpp contains the main function (and any other helper functions that you want to write) which performs this task, calling the methods in dijkstra.hpp as needed. See the lecture notes, which summarize the extended Dijkstra algorithm: translate it into C++.
Copy the form of the following interaction (the output of the graph is elided
in a few places for formatting purposes; the 3 lines of the map shown after
the start node is actually one long line).
|
Testing |
The easiest way to start testing/debugging the HashGraph class is by using its driver program. It allows you to perform any method call supported by the templated classes, and see the state of the class (or view the debugger). Here is reasonable order to write and test the code: unlike implementing data types with data structures, these method can be written and tested more independently. Write the default constructor and the add_node and add_edge methods first. Ensure that when adding an edge, you update LocalInfo for the origin and destination nodes. When these work successfully, write the has_node, has_edge, and edge_value methods; also in_degree, out_degree, and degree. Now return to the commands that remove information from the graph: remove_edge is much simpler than remove_node, because removing a node requires removing any edge using that node, which means updating the LocalInfo for all nodes that are connected by an edge to the removed node. Then finish writing the commands: the clear, load and store methods. Next, write the copy constructor and all the remaining queries: most are just a single line of code, after the checking/throwing of an exception. Finally, write the =, ==, and != operators. After you debug your code with the driver, try running the appropriate GoogleTest code. Again, this form of testing is useful only as you approach a finished solution. We will use the GoogleTest, and visual inspection, to grade this assignment. Important Note: You can put std::cout statements in the GoogleTest code (but don't accidentally remove any of the assertions, otherwise you won't be fully checking your code the way we will) while you are debugging your classes. All debugging std::cout should end in std::endl to flush the output stream: ensure the output it displayed before executing the next statement In this GoogleTest I have commented-out a try-catch block surrounding the test; if you are having problems with code inside a test, uncomment this code and you might get some useful information. Likewise, the easiest way to start testing/debugging the HeapAdjustableHashPriorityQueue classes is by using its driver program and checking that the invariant is correct; that is easy to do in the driver because it shows both the heap array and the map showing at what index each value is stored. After you debug your code with the driver, try running the appropriate GoogleTest code (see comments above). The easiest way to start testing/debugging the Dijkstra algorithm is by embedding print statements (possibly controlled by a debugging boolean) that trace the progress of the algorithm, including printing each important data structure as the algorithm executes. |