Rock and Roll Stops the Traffic

As residents of southern California, most of us face the realities of heavy traffic. If we attempt to drive on a local freeway during rush hour, we invariably experience major traffic jams and long delays. What’s worse, we are rarely apprised of unusual delays (e.g. accidents, road closures) before we’ve encountered them. Even if we know about a delay in advance, it’s often difficult to find a suitable alternate route, either because we’re unfamiliar with the area or because those alternatives are just as clogged with traffic as the route we’re trying to avoid.

It’s not hard to imagine how the presence of a wireless Internet connection in everyone’s car could help the situation. Aside from providing the obvious ability to download traffic reports and maps on demand, you could also obtain up-to-the-minute traffic information that, with a little computing, could actively aid you in finding the best way to get from one place to another, optimized not only for distance, but also for the shortest drive time the current traffic conditions allow. For example, as drivers were diverted around the scene of an accident, traffic conditions would change; the advice offered by drivers’ in-car systems would also change, optimally routing cars around a traffic jam by sending different cars down different alternate paths.

For this project, you will write a simplified version of such a system (from scratch; we provide no code for this assignment). Given a file of a computer-friendly abstraction of a map of streets and freeways, along with a snapshot of the current traffic between points on the map, your program is to find the shortest distance or fastest route from one location to another. Your program is to ask the user the name of the file, read in its information, compute what is requested, and print out the results to the console, neatly; details follow.

Our abstraction of a street map

Real-life street maps are a handy way for a human being to determine an appropriate route to take from one location to another. They present an abstraction of the world as a scaled-down drawing of the actual streets. In order to be useful to us, a street map needs to give us the names of streets and freeways, to accurately communicate distances and directions, and to show us where the various streets and freeways intersect.

For our program, we’ll need to develop a different abstraction of a street map. Our abstraction must contain the information that is pertinent to the problem we’re trying to solve, presented in a way that will make it as easy as possible for our program to solve it. Not surprisingly, a picture made up of lines and words is not an abstraction that is useful to our program; it would require a tremendous amount of effort to design and implement an algorithm to interpret the lines and words and build up an appropriate representation. It’s better that we first design the more convenient representation, then write our program to read and understand an input file that specifies it. To do so, we’ll need to consider the problem a bit further.

Our program’s main job is to discover the shortest distance or time between two locations. There’s no reason we couldn’t think of locations as being any particular point on a street map (for example, every valid street address, or even any valid GPS coordinate). For simplicity, we’ll think of them as one of two things:

The intersection of two or more streets.
A point on a freeway at which there is an entrance and/or an exit.

Connecting pairs of locations on the map are stretches of road. In order to solve our problem, we’ll need to know two things about each stretch of road:

Its length, in miles
The current speed of traffic traveling on it, in miles per hour

Our map will consist of two kinds of roads: streets and freeways. A street is a sequence of intersections, connected by stretches of road; each road segment has a length and a speed of traffic. Note going from intersection a to intersection b on a road is not necessarily the same as going from b to a, as the road length and/or speed of traffic could well be different. In fact, if the street is one way, there may not even be a road segment from b to a.

In real life, many intersections control traffic using stop signs or traffic lights. Our program will ignore them; we’ll instead assume that the traffic speeds on streets has been adjusted appropriately downward to account for the time spent waiting at stop signs and lights.

As with streets, freeways are stretches of road traveling in opposite directions. We need to treat each side of the freeway as an entirely separate road for the same reasons as streets and also because the sequence of ramps on one side of a freeway is sometimes different than the sequence on the opposite side. For example, the 405 South has a transition ramp that leads to the 73 South, but the 405 North has no such ramp.

Since streets and freeways have the same properties for the purposes of our program, we can think about streets and freeways as roads, that is, a sequence of locations connected by stretches of road.

The output of our program will be a trip, if it exists, or a message that the trip cannot be made. A trip is a sequence of visits to locations on the map froma a given starting location to a given ending location. For example, Alex tells us that, when he used to live in Costa Mesa, his trip home from UCI looked like this:

Start at Peltason & Los Trancos
Continue to Bison & Peltason
Continue to Bison & California
Continue to Bison & 73N on-ramp
Continue to 73N @ Birch
Continue to 73N @ 73N-to-55N transition
Continue to 55N @ Baker
Continue to 55N Baker/Paularino ramp & Baker
Continue to Baker & Bristol

where he started at Pelatason & Los Trancos and ended at Baker & Bristol.

In addition to the information above, our program will also output information about the distance in miles and travel time of each of the segments of the trip, as well as the overall distance and travel time for the whole trip.

A trip cannot be made if there is no route from the starting point to the ending point.

Representing our abstraction of a street map

If you consider all of the data that we’ll need to represent this abstraction, the task of organizing it can seem overwhelming. However, there is a well-known data structure that represents this system in a straightforward way: a directed graph.

Using a directed graph, vertices can represent locations, and edges can represent the stretches of road that connect them. Since traffic travels in only one direction on a stretch of road, it makes good sense that the graph be directed.

Each vertex in the graph will have a human-readable name for the location it represents. For example, a vertex might be named Culver & Harvard or it might be named I-405N @ Jamboree. The name will be used only for display purposes; it won’t have any significance to our algorithm. Instead, vertices are numbered uniquely and consecutively, starting at zero; if there are n vertices, they should be numbered 0 .. n - 1. This approach works much better than using names.

Each edge will contain the distance between the two vertices (in miles, stored as a double) and the current speed of traffic (in miles per hour, stored as a double).

A trip is a sequence of visits to adjacent locations on the map, locations are represented by vertices, and two locations are adjacent only when there is an edge connecting them; a trip is simply a path in the graph.

The input

Your program is to read its input from a file. So that we can keep everything straight during the grading process, write your main() method in a class called RockAndRoll.

The project file contains a few sample input files, to illustrate the format of input files to this program and to provide you files on which you can test your code—of course, you should test your program further by using additional legal input files.

Blank lines, or lines beginning with a # character are ignored anywhere they appear—the former are to improve readability, the latter so we can put comments in the file. Blank and comment lines can appear anywhere in the file, or not at all—so be sure your file input routine does not rely on them being in any specific location in the file.

An input file is separated into three sections: the names of the map locations, the road segments connecting them, and the trips to be analyzed.

The first (non-blank, non-comment) line of the first section contains the number of locations. If there are n locations, the next n lines of the file (again ignoring any blank and comment lines) contain the names of each location. The locations will be stored in a graph as vertices; verticies are numbered consecutively in the order in which they appear in the file, starting at 0.

The next section of the file defines the road segments. Each road segment will be an edge in the graph. The first line of this section defines the number of segments. Following that are the appropriate number of road segment definitions. Each segment is defined on a line, with four values on it:

The vertex number where the segment begins
The vertex number where the segment ends
The distance covered by the segment, in miles
The current speed of traffic on the segment, in miles per hour

Finally, the trips are defined. Again, the section begins with a line containing the number of trips. Following that are that number of trip requests. Each trip request is a line with three values on it:

The starting location for the trip
The ending location for the trip
D if the program should determine the shortest distance, T if the program should determine the shortest driving time

Your program should read the vertices and edges from the file, build the graph, then process the trip requests in the order that they appear. The content and format of the output for each trip request is described below.

You may assume that the input file will be formatted as described above and that the provided data is correct and consistent. For instance, if the file says there are 23 locations, then there will be 23 locations, trip lengths and times will be positive and the starting and ending locations for a trip will exist. However, a correct and consistent file does not imply that all possible connections between map locations actually exist. For instance, there may be a way to get from point A to point B, but no way to get back again, or no way to get from point A to point C. Your program should handle such situations correctly and gracefully—in particular, it should not blow up!

Implementing the graph

There are two main approaches that can be used to implement a graph: an adjacency matrix and an adjacency list. Sparse graphs (that is, ones with few edges relative to the number of vertices) are better implemented as adjacency lists, since an adjacency matrix would waste a great deal of memory storing the many empty cells in the matrix. Our street map is a sparse graph, since each vertex will have edges to and from only a few relatively “nearby” vertices. So, adjacency lists are a far superior approach in our case; you are required to use them to represent your graph.

A good way to store adjacency lists is to place each vertex’s information into the cell of an ArrayList that corresponds to its vertex number. Each cell has a vertex name and a reference to the first node in a list of its outgoing edges. For each of these edges, we store the number of the adjacent vertex, the distance to that vertex, and the time to travel to it. (Remember that an adjacent vertex is one that can be reached from the current vertex by following one edge.) You will obviously need methods to build up the adjacency lists and to access them to get information about a particular vertex or edge.

Finding the shortest paths

The problem we need to solve, that of finding the fastest or shortest trip along a network of roads, is an instance of the single-source, positive-weighted, shortest-path problem. In other words, from one particular vertex (a “single source”), we’ll be finding the shortest path to another vertex where all of the edges have a “positive weight” (in our case, distance or speed) associated with them.

There are a couple of well-known algorithms for solving this problem; we’ll use one called Dijkstra’s Algorithm. We describe it below; for more detail, see Section 13.5 of the Goodrich text. You may follow the approach below, the more elaborate one in Goodrich, or another one when implementing Dijkstra's Algorithm, but you must write the actual code for your implementation. In particular, copying of the code from the text will be considered plaigarism, a violation of course academic honesty rules.

Dijkstra’s Algorithm actually finds the shortest path from some start vertex to all the other vertices in a graph— it needs to calculate them all in order to work. This is actually to our benefit. Suppose the file has multiple, consecutive trips starting from the same vertex and measuring the same thing (distance or time). With Dijkstra’s Algorithm, when we compute the first trip, we have shortest paths to all other vertices. So, if the next trip starts from the same place, and is using the same metric (distance or time) as the just-completed trip, we can just look up the shortest path to the destination from the just-computed results from the previous trip. Use this approach in your program; it is likely a "real" file would contain multiple consecutive trips that start from a particular place, and a well-designed program would not re-compute data it already has available.

For each vertex v, Dijkstra’s Algorithm keeps track of three pieces of information: k_v, d_v, p_v.

k_v is a boolean flag that indicates whether the shortest path to vertex v is known. Initially, k_v is false for all vertices.
d_v is the length of the shortest known path from the start vertex to vertex v. When the algorithm begins, no shortest paths are known. Initially, d_v is INFINITY for all vertices, except the start vertex, for which d_v = 0.
p_v is the predecessor of the vertex v on the shortest known path from the start vertex to v. Initially, p_v is unknown for all vertices, except for the start vertex, for which p_v is none.

As the algorithm proceeds, it will need to calculate the cost for individual edges. The cost of the edge from v to w will be called C(v, w). How you calculate the cost depends on whether you’re minimizing driving distance or driving time:

If you’re minimizing driving distance, C(v, w) is the number of miles on the edge from v to w.
If you’re minimizing driving time, C(v, w) is the number of minutes required to drive along the edge from v to w, given its length and traffic speed.

Dijkstra’s Algorithm proceeds in phases. The following steps are performed in each pass:

From the set of vertices for which k_v is false, select the vertex v having the smallest d_v.
Set k_v to true for the vertex you picked in step 1.
For each vertex w adjacent to v (i.e. there is an edge from v to w) for which k_w is false, test whether d_w is greater than d_v + C(v, w). If it is, set d_w to d_v + C(v, w) and set p_w to v.

For each pass, exactly one vertex has its k_v set to true (in other words, we discover one known shortest path per pass).

Here is pseudocode for the algorithm. Notice the use of a priority queue, which allows you to easily find the vertex with the smallest d_v in step 1.

  
    for each vertex v
    {
        set k_v to false
        set p_v to unknown (or none, if v is the start vertex)
        set d_v to infinity (or 0, if v is the start vertex)
    }

    PriorityQueue pq = new PriorityQueue()
    pq.enqueue(start vertex, 0)
    
    while (!pq.isEmpty())
    {
        vertex v = pq.dequeueMin();
        
        if (k_v is false)
        {
            k_v = true

            for each vertex w such that edge v -> w exists
            {
                if (d_w > d_v + C(v, w))
                {
                    d_w = d_v + C(v, w)
                    p_w = v
                    pq.enqueue(w, d_w)
                }
            }
        }
    }

At the conclusion of the main loop, the d_v value corresponding to the end vertex will be the amount of the shortest path. You can find the actual path of vertices by working your way backward from the end vertex to the start vertex, following the p_v values as you go along. (This implies, of course, that you need to store all the p_v values.)

Note that if d_v for a node remains infinity after the algorithm completes, there is no path from the start vertex to the vertex v.

Remember that, after the algorithm has finished with one trip, you should store the results in memory so that you can look them up later for subsequent trips. It turns out you only need to store the p_v values. There’s no reason to store the k_v values, because they will all be true after the algorithm completes. And there’s no need to store the d_v values, because you will need to look up the times or distances between each vertex in the path anyway. Since we always output all of the segments of a trip, you can easily sum these up to calculate a total.

As you can see from the pseudocode for Dijkstra’s Algorithm, you will need a priority queue in order to implement it. You are required to implement the priority queue using a minimum binary heap, such that all enqueues and dequeues are O(log n). (Recall that a minimum binary heap is one where the root of any subtree in the heap has a value smaller than its children.) The priority queue should be implemented in its own class, of course. Remember that you are to write your own code; in particular, do not copy a heap algorithm straight out of a book!

The output

For each of the trip requests in the input file, your program should output a neatly-formatted report to System.out that includes each leg of the trip with its distance and/or time (as appropriate), and the total distance and/or time for the trip.

If the trip request asks for the shortest distance, the output might look something like the following (these are phony trips, to show you the output format; they are not related to the data file above):

  Shortest distance from Alton & Jamboree to MacArthur & Main
    Begin at Alton & Jamboree
    Continue to Main & Jamboree (1.1 miles)
    Continue to Jamboree & I-405N on ramp (0.3 miles)
    Continue to I-405N @ MacArthur (1.3 miles)
    Continue to MacArthur & I-405N off ramp (0.1 miles)
    Continue to MacArthur & Main (0.2 miles)
  Total distance: 3.0 miles

On the other hand, if the trip request asks for the shortest time, the output might look like this:

  Shortest driving time from Alton & Jamboree to MacArthur & Main
    Begin at Alton & Jamboree
    Continue to Alton & MacArthur (2.7 miles @ 33.7 mph = 4 mins 48.8 secs)
    Continue to Main & MacArthur (1.1 miles @ 40.1 mph = 1 min 38.7 secs)
  Total time: 6 mins 27.5 secs

When printing a time, you should separate it into its components—hours, minutes, and seconds—as appropriate. Here are some examples:

  32.5 secs
  2 mins 27.8 secs
  13 mins 0.0 secs
  3 hrs 13 mins 12.3 secs
  6 hrs 0 mins 0.0 secs

If there are zero hours, do not show them; if there are both zero hours and zero minutes, show neither of them.

And if the trip cannot be made, your program should print out a message to that effect, including the start and end points, something like

You cannot make a trip: there are no roads
from the North Pole to Antartica.

Limitations

Except for java.util.ArrayList, you may not use any of the collection classes in the java.util library (e.g. LinkedList, TreeMap, HashMap). You’re free, and encouraged, to use other utility classes, such as Scanner, to parse a line of the file into its meaningful components.

Optional Work

Enhance your program so that it notifies the user whenever the input file is incorrect—and don’t process the trips. For instance, you can check for too many or too few location entires as compared to the number of locations the file says are present. You can tell the user when a road segment’s length or time is negative or zero. You can report that fields are missing from a line. And so on. If many errors (of the sorts for which you are checking) are present, report them all.

Add a designation to locations to show whether they are start only, end only, or both: on real roads (especially freeways), the presence of a way off the road does not imply that there is a way onto the road from that same location. Process trips in the input file that start at a start location and end at an end location, and print error messages for those that do not.

Deliverables

Remember to name the file containing your main() method RockAndRoll.java. Place all of the .java files that are part of this program into the source folder of the project. In addition, if you completed any of the optional work, include in the project folder a README file that explains what features you implemented, and the algorithms you used to implement them. Then ZIP up the project folder and submit the ZIP file to Checkmate.

Minor revision to make clear that, if a trip cannot be made, a message to that
  effect is to be printed, by Norman Jacobson, May 2012.
Originally written by Alex Thornton, Spring 2001, with portions of the description of
  Dijkstra’s algorithm based on a description in Data Structures and
   Algorithms with Object-Oriented Design Patterns in Java by Bruno R. Preiss.
Revised for ICS 23 Fall 2001 by Norman Jacobson, September 2001
Minor revisions by Norman Jacobson, December 2001, March 2002, May 2002, and June 2002
Revised again by Alex Thornton, Fall 2002
Minor revisions by Norman Jacobson, December 2002 and January 2004
Minor revisions by Norman Jacobson, March 2004, to reflect the Goodrich text
Minor revisions by Norman Jacobson, June 2005, to reflect Java 5.0
Minor revisions by Norman Jacobson for ICS23 Spring 2006, March 2006
Revised to stress that comment and blank lines can appear anywhere in the input file, and
   that an input file does not have to allow trips to all map locations, by Norman Jacobson
  for ICS23 Winter 2007, January 2007
Minor format editing by Norman Jacobson, December 2007 and December 2008
Added a README file to document optional work, by Norman Jacobson, January 2009
Made clear that submission is to be a ZIPped folder, by Norman Jacobson, February 2009
Minor format changes by Norman Jacobson, March 2009
Made explicit that dv = infinity means there is no path to v, and fixed a minor typo, by Norman Jacobson, May 2009
Revised to reflect use of Eclipse and thus a project folder, and to effect a few minor edits, by Norman Jacobson, March 2010
A couple of minor edits, by Norman Jacobson, May 2010 and March 2011
Clarifications regarding when paths should not be recomputed, by Norman Jacobson, May 2011
A couple of minor edits, by Norman Jacobson, March 2012