ICS 33 Summer 2013
Project #4: No Line on the Horizon

Due date and time: Monday, August 26, 11:59pm


Introduction

In business, scientific, or engineering contexts, experimentation is often a very pricy endeavor. Perhaps the potential benefit of finding a better solution to a problem might be extraordinary, but you don't always know if your ideas will yield those benefits without trying them out. But if the experimentation is too costly, you might never be willing to pay the cost of finding out whether your idea is a worthwhile one or not.

Computers have a role to play in situations like this. At their most fundamental level, computers are tools for automating processes that are otherwise more expensive. If you have a strategy for solving, say, an engineering problem, and if you could write a program that will show you how your strategy is likely to turn out, you can more cheaply find the answer you're looking for — should I follow this strategy or not? — and, if it turns out to be positive, you can proceed with making the bigger investment in trying it out "for real."

Suppose you have an idea for a better design for an airplane wing, a more efficient way to serve customers in a restaurant, or a more profitable way to participate in financial transactions. Short of observing the actual results of building and testing the new wing, adjusting the way that customers are served, or making a large number of financial transactions in real conditions — expensive and risky propositions! — a computer program called a simulation can be developed instead. Simulations approximate the important parts of a real-world situation — the physical forces acting on an airplane wing, the movement of customers through a business, the hypothetical change in prices in an financial market — while ignoring aspects that have little or no affect on the outcome you want to measure. Depending on how believably the simulation approximates reality, the results of the simulation might well be a reasonably accurate approximation of reality, as well.

In this project, we'll consider a situation like the one described above. Suppose that the movie theater chain Millennium Cinemas is interested in finding ways to attract customers away from their larger, more well-established rivals. With the proliferation of "megaplex" movie theaters in recent decades, Millennium management believes that one problem that movie-goers face is the long wait in line to buy tickets. They seek a solution that will allow their customers to spend less time in line, hoping that this will dramatically improve their sales. It's not in their best interest to spend a lot of time running experiments, especially since some of the things they might want to try could well end up lengthening their customers' waits in line, ultimately hurting their reputation and their business. (When you're a large company, you can spend money trying things that might have a negative impact, absorb the loss, and move on; when you're small, you don't have this kind of freedom.)

So, to aid in their analysis of the time that customers spend in line, you'll build a program that will simulate various arrangements of lines and ticket windows, to allow them to find the optimal combinations for its theaters. The project will give you practice implementing your own custom data structure called a queue, while helping you to see the value of "layering" a solution to a complex problem. You'll also have a chance to continue developing your design skills, as you'll be implementing this project mostly from scratch. Don't worry, though; we'll give you plenty of help along the way.


The program

Millennium requires a program that will allow them to simulate a variety of arrangements of ticket lines, ticket windows, and patterns of customer arrival. You will write a program that gives them this flexible simulation ability. It will allow the user to specify the number of ticket windows, whether there will be a single line or a separate line per window, and the speed at which each ticket attendant can process customers. Once these simulation parameters have been specified, the program processes a file that indicates the arrival of customers, simulating their arrival into the ticket lines, tracking how long each customer spends in line, and ultimately calculating statistics about them.


Window and line arrangements

Millennium has many different styles of theaters, from single-screen theaters showing classic or foreign films to larger multi-screen varieties showing the latest releases. The outside areas of all of these theaters, big and small, are organized in the same basic fashion:

To support its various sizes and styles of theaters, it wishes to investigate the relative effect of two arrangements of ticket windows and lines.

  1. One or more ticket windows, each with its own separate ticket line.
  2. Multiple ticket windows, and one ticket line that all customers stand in; a customer can proceed from this single line to any available ticket window.

For each arrangement of lines, they will also be interested in investigating the number of windows necessary to support various numbers and patterns of customers. They will use the simulator to determine the contexts in which each arrangement will be appropriate.

As with many simulations, your goal is not to demonstrate which arrangement is best; your goal is merely to demonstrate the theoretical effect of a particular scenario given as input.


The input file

In order to simulate different theater configurations, the simulator will need to accept a set of parameters that describe them. Those parameters will be described in an input file, a text file that specifies the length of the simulation, the number of ticket windows, whether there is a single ticket line or a separate line for each window, and the speed at which customers are processed at each window. In addition, the input file will contain a sequence of customer arrivals, specifying how many customers will arrive at the theater at which times. Here is an example of an input file. (Note that the italicized portions are included here for descriptive purposes only, and would not be included in an actual input file.)

Short simulation     brief description of the simulation
5                    the length of the simulation, in minutes
2                    the number of ticket windows
S                    how many lines: "S" for single, "M" for multiple
45                   number of seconds it takes to process a customer at window #1
35                   number of seconds it takes to process a customer at window #2
1 30                 one customer arrives 30 seconds into the simulation
5 35                 five customers arrive 35 seconds into the simulation
3 45                 three customers arrive 45 seconds into the simulation
1 60                 one customer arrives 60 seconds into the simulation
1 90                 one customer arrives 90 seconds into the simulation
END                  the "END" tag marks the end of the customer arrivals

Notice that the input file is broken into two parts: the configuration section, which describes how the theater is configured (e.g., the arrangement of ticket windows and lines), and the customer arrival section, which lists the arrivals of customers.

The configuration section

The first line of the configuration section is a brief description, perhaps a sentence or two, that explains the purpose of the simulation; this is useful to include in the simulation's output, so that it's clear what scenario is being tested. The next line specifies the length of the simulation, measured in minutes. Following that, the next line specifies how many ticket windows the simulated theater should have. This is followed by either S or M on a line by itself, which specifies whether there is a single ticket line or a separate ticket line per window. Finally, there is a line for each of the windows — they should be numbered from 1..n — that states how many seconds it takes the attendant at that window to sell a ticket to a single customer.

There must be a positive number of ticket windows, though there is no pre-defined maximum. The number of seconds that each attendant takes to process a customer must also be a positive number.

Note that the theater described in this configuration section is an idealization of a real theater; there are some details that have been left out for the purposes of simplicity:

We'll see a handful of additional simplifications along the way; they've been made to keep the simulation somewhat simpler and more predictable, at the cost of skewing the results slightly.

After your program has read the configuration section, the simulator should be prepared to begin processing the arrival and departure of customers, so any applicable objects and data structures should have been created and initialized appropriately before reading any of the customer arrivals.

The customer arrival section

Each line in the customer arrival section of the input file describes the arrival of a number of customers at a particular time. The time is specified as the number of seconds since the simulation started, and they must increase as the file is processed (i.e., the time on each line in the file must be strictly greater than the time on the line that came before it). The number of customers on each line must be positive.

When a batch of customers arrives, they each get into a line. If there is only a single line, obviously they all get into that line. If there are multiple lines, each customer should be separately processed, each getting into the line that has the fewest number of customers already in it (regardless of whether any corresponding window is occupied). In the event of a tie, a customer should always get into the lower-numbered line. There is no maximum number of customers that can be in a particular line, but, in general, they will spread relatively evenly between lines given this rule.

When a window is unoccupied, a customer immediately moves from its corresponding ticket line to the window. (For example, if a customer departs a window at time 60, another customer will arrive in the window at time 60 if one is waiting in the corresponding line.) That customer will stay for the appropriate number of seconds. At that time, the customer will leave the window and will immediately be replaced by another.

For the sake of simplicitly, we'll assume in our simulation that customers never move from one ticket line to another. In other words, suppose there is a separate ticket line per window. If there is no one in the first window and no one in the first line, but there are three customers in the second line, we will assume that the customers in the second line will not go to the first widnow. (Obviously, this is a non-issue when there is only one single line.)

Assumptions about the input file

You may assume that the input file is formatted as described above. You may not assume that it will be exactly the file that we're showing as an example — and we can guarantee you that we'll test cases other than the one shown as an example — but you may assume that there won't be a word where a number is expected, a negative number where a positive one is expected, and so on.

Given an illegal input file — one that doesn't follow the rules described — it's fine for your program to misbehave or even crash.


Running your program

To simplify our testing, there are two rules you'll need to follow in your design.

We will have some test automation built around these requirements, so be sure to follow them, or we'll find that your program fails all of our tests.


Simulation time

As your simulation progresses, it must keep track of a notion of time. Things happen in the simulation at specified times — customers arrive at particular times, they stay in windows for a specified length of time, and so on. It's important to understand, though, that simulation time is not the same thing as real time. A simulation that is intended to determine the movement of customers in an eight-hour day at a theater shouldn't take eight hours to run; since there's no visualization component here (i.e., we're not seeing a graphical representation of customers moving around), the only requirement is to see the results, so the goal is to calculate those results as quickly as possible.

The simulation's notion of time is guided by a simulation clock. Everything scheduled and measured in this simulation is done as a number of seconds, so the clock tracks how many seconds of simulation time have elapsed since the simulation started. After a simulation has been running for two minutes of simulation time, we'd say that it's time 120 on the simulation clock. Seconds are discrete; there's no notion of a time 120.5, only a time 120 and a time 121.


The simulation log

While the simulation runs, your simulator should print a simulation log to the console. The simulation log contains a line of output each time something "interesting" happens. Each line of the log should show the current simulation time and a brief description of the event. Include (at least) these events in your simulation log:

The simulation log for the example input file

The example input file would produce a simulation log with (at least) these events listed.

Time    30 - Customer entered line #1
Time    30 - Customer exited line #1 (spent 0 seconds in line)
Time    30 - Customer entered window #1
Time    35 - Customer entered line #1
Time    35 - Customer entered line #1
Time    35 - Customer entered line #1
Time    35 - Customer entered line #1
Time    35 - Customer entered line #1
Time    35 - Customer exited line #1 (spent 0 seconds in line)
Time    35 - Customer entered window #2
Time    45 - Customer entered line #1
Time    45 - Customer entered line #1
Time    45 - Customer entered line #1
Time    60 - Customer entered line #1
Time    70 - Customer left window #2 (spent 35 seconds at window)
Time    70 - Customer exited line #1 (spent 35 seconds in line)
Time    70 - Customer entered window #2
Time    75 - Customer left window #1 (spent 45 seconds at window)
Time    75 - Customer exited line #1 (spent 40 seconds in line)
Time    75 - Customer entered window #1
Time    90 - Customer entered line #1
Time   105 - Customer left window #2 (spent 35 seconds at window)
Time   105 - Customer exited line #1 (spent 70 seconds in line)
Time   105 - Customer entered window #2
Time   120 - Customer left window #1 (spent 45 seconds at window)
Time   120 - Customer exited line #1 (spent 85 seconds in line)
Time   120 - Customer entered window #1
Time   140 - Customer left window #2 (spent 35 seconds at window)
Time   140 - Customer exited line #1 (spent 95 seconds in line)
Time   140 - Customer entered window #2
Time   165 - Customer left window #1 (spent 45 seconds at window)
Time   165 - Customer exited line #1 (spent 120 seconds in line)
Time   165 - Customer entered window #1
Time   175 - Customer left window #2 (spent 35 seconds at window)
Time   175 - Customer exited line #1 (spent 130 seconds in line)
Time   175 - Customer entered window #2
Time   210 - Customer left window #1 (spent 45 seconds at window)
Time   210 - Customer exited line #1 (spent 150 seconds in line)
Time   210 - Customer entered window #1
Time   210 - Customer left window #2 (spent 35 seconds at window)
Time   210 - Customer exited line #1 (spent 120 seconds in line)
Time   210 - Customer entered window #2
Time   245 - Customer left window #2 (spent 35 seconds at window)
Time   255 - Customer left window #1 (spent 45 seconds at window)
Time   300 - Simulation ended

The final report

When the simulation is complete, your simulator should print out the final report. The final report includes summary statistics about the simulation and should be printed to the console. The final report must contain the following information.

Percentages and fractions should be reported to two decimal places (e.g., "83.33%" or "35.75").

The statistics for the example input file

There are the statistics that should appear in your final report after running a simulation given the example input file. Note that the decimal numbers may be off by a hundredth in some cases for you, depending on how (and whether) you handle rounding, and that's fine.


How about some additional examples?

To test your simulator, you'll want to run more examples than just the one example file that I provided. However, only one example is being provided here. Part of the goal of writing a program is to find a way to be sure that it works; that means that you need to test it. In the case of a simulation, that means you need to come up with some interesting inputs, figure out for yourself what the output should be for each, then see if your program generates the right output. That's part of what I'm expecting you to do for this project, though you will not be required to submit your additional input files. In fact, because it can be an arduous task to build input files and figure out what their expected output should be, I encourage you to share your input files and expected outputs with one another, so that everyone can benefit from one another's insights about testing the program. (Obviously, it should go without saying that you should not be sharing your program code.)


Some suggestions (and requirements) about the design of your simulator

This is a large problem with a number of moving parts, so there are many ways that you could implement your simulator. In general, I am not imposing an overall design on you, though I will make a few specific requirements, in addition to the ones listed above in the section titled Running your program; your goal, where you have design freedom, is to find ways to break up large, complex tasks into simpler ones. Be on the lookout for concepts in the simulation that could nicely be turned into classes or functions.

A linked-list-based queue implementation

As we'll see in lecture, linked data structures such as linked lists provide an alternative to array-based lists like Python's built-in list. Array-based lists excel in ways that linked lists don't and vice versa, and it's handy to know these differences, so you can make an informed choice when you need to store a list of objects; sometimes, it's vital to make this choice correctly if you want to write a program that will perform reasonably.

Your implementation is required to be broken up into two separate classes: a LinkedList class that implements a singly-linked list with head and tail references, and a Queue class that implements the concept of a queue "on top of" the LinkedList class. (We will see an example of this kind of layering in lecture.) In particular, you will need to provide (at least) the following functionality in these classes:

Your LinkedList class should be written in a module named linkedlist33.py, and your Queue class should be written in a separate module names queue33.py. We will be testing your LinkedList and Queue classes separately as part of your grade on this project, so following this naming convention and the implementation requirements above will be important.

Notably, neither your LinkedList nor your Queue classes should be specific to this simulator; they shouldn't include any notion of "customers", "ticket lines", and so on. These are tools on top of which you can build bigger things.

Sanity-checking your LinkedList and Queue classes

As we did in the previous project, we will be testing part of your project — namely, the LinkedList and Queue classes — by running a set of unit tests against them, which verify that the behavior described above is present and working as it should. Once again, it will be necessary that you've spelled and capitalized things correctly, put underscores where they belong, and so on.

To assist you in verifying the presence of the things that are required, below are links to a set of sanity-checking unit tests for your LinkedList and Queue classes. As in the previous project, they do not assert any behavior, but they do verify that the necessary things are present.

As before, do not include these sanity-check tests in your own set of unit tests; your unit tests should instead focus on whether the necessary functionality does what it should.

Deciding on your classes

A somewhat simplistic, but often effective, way to break a large design into classes is to consider a description of the desired system and "look for the nouns." While this technique won't give you a complete or perfect design, it will at least give you a good feel for what the major abstractions in your program are. Using that approach for this program, you might discover some of the following ideas (and probably others). Of course, depending on how you describe your program, you may discover different abstractions than these.

The simulation

Your program should load the setup information from the input file first, and then setup the simulation (i.e., create the theater object with the appropriate initial state) based on those values. From there, you can implement the simulation in various ways, but I suggest the following approach. (This is by no means the most efficient way to implement such a simulation, but is efficient enough for our purposes, striking a good balance between efficiency and ease of implementation for this course.)

Write a simulation loop, each iteration of which simulates one second. Each second, the loop performs the following major tasks:

When the specified number of seconds of simulation time elapse, the simulation ends, and the accumulated statistics should be printed out.

Memory usage

You must not load all of the customer arrivals into memory at the beginning of the simulation. Instead, you should read the customer arrivals into memory one line at a time, when necessary. Reading all of the customer arrivals into memory ahead of time will yield a substantial penalty on your Quality of solution score for this project.

Spoken in terms of O-notation, if there are n customer arrivals specified in the input file, you should use O(1) memory to process them; it should be possibly to run a simulation using an input file with billions of customer arrivals — assuming that this doesn't result in billions of customers standing in lines at the same time — without running out of memory.


Unit testing

As with recent projects, you will be required to write unit tests using Python's built-in unittest module. The only things you are required to unit test are your LinkedList and Queue classes, though you can feel free to include as many test modules and test as many parts of your program as you'd like.

There is wisdom in implementing LinkedList and Queue, along with unit tests, before proceeding with the rest of the project, as a "broken" implementation of LinkedList and Queue will virtually guarantee that your simulator will have bugs in it.


Limitations

Third-party libraries — i.e., anything not included in a standard Python 3.3.2 installation — are strictly off-limits in this project. Other than the standard Python library, all of the code should be written solely by you.

Additionally, pre-existing implementations of linked lists or queues (such as collections.deque) are also off-limits, as one of the objectives of this project is learning to implement your own linked data structures.


Deliverables

Put your name and student ID in a comment at the top of each of your .py files, then submit all of the files to Checkmate. Take a moment to be sure you've submitted all of your files and be sure you submit the right version; we will only be able to accept the files you submit before the deadline, so forgetting to submit one (or submitting the wrong version) can have a significant impact on the score you receive for this project.

Follow this link for a discussion of how to submit your project via Checkmate.

Can I submit after the deadline?

Yes, it is possible, subject to the late work policy for this course, which is described in the section titled Late work at this link.