Informatics 122 Winter 2013, Project #1: Who's Gonna Ride Your Wild Horses?

Background

Understanding a problem domain and the business context

One of the major advantages of having skills related to the design and implementation of software is that these skills are easily translated from one industry to another. If you work on a project or job in the mortgage industry, nothing precludes you from doing your next one in something completely unrelated, like health care. At some level, software is software, making software-related skills very powerful to have.

Of course, that's not to say that a switch from one problem domain to another is simple. Whenever you make such a switch, you have a lot to learn about the business context — What problems does the business have? How can software be brought to bear to solve those problems? How do existing systems in the business already solve some of these problems? What other businesses are involved and how can their interests best be served by new or existing systems?

For example, many of you will know very little about the intricacies of betting at racetracks (and even my knowledge of this subject is somewhat limited), yet this is knowledge that you can acquire as needed in order to tackle this project. In that way, this project is a realistic one; for all of the projects I've done in the industry over the years, not one has been something I could do without learning about a business context that was unfamiliar to me when I started. The sooner you can grow accustomed to being out of your element at the start of a project, while still understanding the right kinds of questions to ask and the right kinds of research to do, the better. Unless you happen to be a horse racing aficionado, this project will give you some practice at doing that. (If you are, don't worry; we'll work on something completely different next time.)

Structure of a race day

Race day at a racetrack involves a series of horse races. The races are generally numbered, usually sequentially, though not always starting from 1. Each race allows people to place bets beforehand and collect any winnings afterward.

One horse race involves a collection of horses that are lined up one next to the other behind gates, with each being ridden by a jockey (a person) who will guide the horse through the race. Each of the horses is given a number — usually sequentially starting at 1, though there are variations on this rule at some tracks — to easily distinguish it from the others. Once all of the horses are lined up and ready, the gates open and the horses begin running. Some distance from the start is the finish line. The horses are ordered by their finish (the first horse to finish is said to have finished in first place, the second is said to have finished in second place, and so on). Horses that do not cross the finish line, or whose jockeys break other rules, are disqualified.

The parimutuel betting system

The most common system of betting that is allowed at licensed tracks is the parimutuel system, which is somewhat different from the kind of betting that you do if you play, say, blackjack in Las Vegas. In the parimutuel system, the track holds no interest in the outcome of any race. Instead, the track keeps a percentage of all of the bets that are made on each race, with the money applied toward the operating costs of the track, taxes and fees paid to state and/or local governments, and profit. The average "take" for racetracks in the U.S. is about 17%, with the remainder of the money split among those bettors who correctly predicted the outcome of each race.

(If you're interested in poker, you may recognize that the parimutuel system bears some resemblance to how poker games work. Casinos that run poker games, too, have no interest in the outcome of each hand, but instead take a percentage of every pot; when you play poker at home, there's generally no "take" at all, with all the money in the pot split amongst winners after each hand. Similarly, many state lotteries use a parimutuel system.)

As an example, suppose that 1,000 people each bet $100 on a race. Imagine that 700 of them bet on horse #1 to win, while the remaining 300 bet on horse #2 to win. In total, 1,000 * $100 = $100,000 has been bet on the race. If the track's "take" is 15%, they will keep $15,000, regardless of the outcome of the race, leaving $85,000 to be distributed among the winning bettors. Here are the possible outcomes:

Winning Horse	Payout	Profit for Winning Bettor
Horse #1	$85,000 / 700 winners = $121.43 per winner	$121.43 payout - original $100.00 bet = $21.43
Horse #2	$85,000 / 300 winners = $283.33 per winner	$283.33 payout - original $100.00 bet = $183.33

A big lesson to take from this example is that betting on a less-popular horse pays more, if you're right, than betting on a more-popular one. (There's a reasonably good chance that the less-popular horse has a lesser chance of winning, if you believe in the wisdom of crowds, but longshots nonetheless win races and big payouts do happen. In general, if you accept the behavior of all bettors as indicating probabilities of a horse winning, you're paid according to those probabilities; people who make money betting on horse races, in the long run, are the ones who understand when the crowdsourced probabilities are incorrect and act accordingly, which is a difficult advantage to maintain.)
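If it helps to see the arithmetic from the table spelled out, here is a minimal sketch in Java. The class and method names are hypothetical, and the sketch assumes that every bettor wagers the same amount, as in the example. It rounds to the nearest cent only because that is what the table does.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative only: these names are hypothetical, not part of the project.
class SimplePoolExample {
    // Payout per winner when every bettor wagers the same amount.
    static BigDecimal payoutPerWinner(int bettors, BigDecimal betSize,
                                      BigDecimal takeRate, int winners) {
        BigDecimal pool = betSize.multiply(BigDecimal.valueOf(bettors));
        BigDecimal take = pool.multiply(takeRate);
        BigDecimal remainder = pool.subtract(take);
        // The table above rounds to the nearest cent, so HALF_UP is used here.
        return remainder.divide(BigDecimal.valueOf(winners), 2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal bet = new BigDecimal("100");
        BigDecimal take = new BigDecimal("0.15");
        System.out.println(payoutPerWinner(1000, bet, take, 700)); // 121.43
        System.out.println(payoutPerWinner(1000, bet, take, 300)); // 283.33
    }
}
```

Using BigDecimal rather than double keeps the cents exact, which matters as soon as money is involved.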

This simple example leads to a couple of questions worth considering, especially if this is a new problem domain for you.

In this example, every bettor made an identically-sized bet of $100, so every winner got the same payout. What does your intuition tell you about how the payouts will change when the bet sizes vary? (Don't read on yet; think about it for a little while first. Come up with a more complex example where the bet sizes vary, then use your intuition to guess what the right outcome would be. Engaging your mind instead of allowing yourself to be told the rules is a great tool for building domain knowledge quickly. Once you have an intuition, then you'll want to verify that your guesses are correct.)
There are extreme examples, though rare, where the technique demonstrated above would lead to a winning bet that would still lose money. Try updating the example above so that 990 of the bettors bet on horse #1 to win and only 10 of them bet on horse #2; how much does a winning bet on horse #1 pay? What do you think is a reasonable way for a track to handle such a case?

Betting pools

When you place a bet on a race, there is more than one kind of outcome you can bet on. There are three different ways you can bet on one horse in a race, for example:

Win. A horse is said to win a race if it finishes first.
Place. A horse is said to place in a race if it finishes either first or second.
Show. A horse is said to show in a race if it finishes either first, second, or third.

So, for example, if you bet on horse #1 to show in race 1, you'll win your bet if horse #1 finishes either first, second, or third in race 1.

The money wagered on each kind of bet is kept in a separate betting pool. In other words, the win pool contains all of the money wagered on horses to win, the place pool contains all of the money wagered on horses to place, and the show pool contains all of the money wagered on horses to show. Payouts for win bets, place bets, and show bets are calculated separately, each from their own pool. In general, show bets are easier to win than the others, so they pay less when you win.

The rules for calculating the payouts for win, place, and show bets are summarized later in this write-up.

Post time, tickets, and claims

Before a race is set to begin — at what is called the post time — bets are accepted. Once post time has been reached, bets are no longer accepted on that race; at that point, the race is cleared to begin.

When a bettor legally places a bet, regardless of its complexity, it is associated with a ticket, which might be a physical printout given to an in-person bettor or a virtual ticket associated with an online bet. Each ticket has a unique identifier that differentiates it from all of the others.

At the conclusion of the race, a bettor can present the ticket back to the racetrack (in-person or virtually) and collect his or her winnings, which is sometimes called making a claim. Naturally, once you've made a claim on a ticket, the track invalidates the ticket so that you can't make another claim against it.
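The "claim once, then invalidate" rule can be sketched very simply. What follows is only one possible approach, not a recommended design: the class name, the method names, and the use of random UUIDs as ticket identifiers are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Illustrative only: a hypothetical registry of outstanding tickets.
class TicketRegistrySketch {
    private final Set<String> outstanding = new HashSet<>();

    // Issues a new ticket with an identifier distinct from all others.
    // Assumption: a random UUID is unique enough for this sketch.
    String issueTicket() {
        String id = UUID.randomUUID().toString();
        outstanding.add(id);
        return id;
    }

    // Returns true the first time an outstanding ticket is claimed.
    // Returns false for unknown tickets and for tickets already claimed.
    boolean claim(String ticketId) {
        return outstanding.remove(ticketId);
    }
}
```

A second call to claim with the same identifier returns false, because the identifier was removed from the set by the first call.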

Details of calculating prospective payouts

As we established earlier, each different kind of bet has its own pool associated with it. Additionally, each has its own algorithm for determining the prospective payout (i.e., how much would be paid out to winners). This section summarizes the rules for calculating the prospective payout of each kind of bet, so we can all be sure we're working with the same rules.

Win

Win bets are the simplest ones for which to calculate payouts. Here is the step-by-step formula for calculating the prospective payout for horse x to win.

Begin with all of the money in the win pool.
Subtract the track's "take."
Subtract the total amount that was bet on horse x to win. The remaining amount is called the profit.
Divide the profit by the amount that was bet on horse x to win. Round down to the nearest cent. This is called the dividend. (If the profit was zero or negative, as it can be in very extreme cases, the dividend should simply be zero. The dividend should never be negative in any betting scenario!)
The payout for each $1 bet on horse x to win will be $1 + the dividend.

Here is an example. Suppose there are five horses in a race and the following amounts have been bet to win on each horse.

Horse #	Amount Bet to Win
1	$100
2	$200
3	$150
4	$300
5	$150
TOTAL	$900

If horse 1 wins the race, the payout would be calculated as above:

Starting with the $900 in the win pool, subtract the track's "take" (say, 15%) of $135, leaving $765 to be distributed to the winners.
Subtract the amount bet on horse 1 to win ($100), leaving $665 profit.
Divide the profit ($665) by the amount bet on horse 1 to win ($100), yielding $6.65.
The payout per $1 bet on horse 1 would be $1 + the $6.65 dividend = $7.65.

(Subsequent formulas will not be accompanied by detailed examples, but the concepts are essentially the same throughout.)
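The win formula can be sketched in Java as follows. The class and method names are hypothetical, and one detail is an assumption: the rules above do not say what happens when no money was bet on the horse, so this sketch simply rejects that case.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative only: the class and method names are hypothetical.
class WinPayoutSketch {
    // Prospective payout per $1 bet on a horse to win.
    static BigDecimal payoutPerDollar(BigDecimal winPool, BigDecimal takeRate,
                                      BigDecimal betOnHorse) {
        // Assumption: the write-up does not cover a horse with no bets,
        // so this sketch throws instead of guessing at a rule.
        if (betOnHorse.signum() <= 0) {
            throw new IllegalArgumentException("no money bet on this horse");
        }
        BigDecimal afterTake = winPool.subtract(winPool.multiply(takeRate));
        BigDecimal profit = afterTake.subtract(betOnHorse);

        // The dividend is never negative; otherwise round down to the cent.
        BigDecimal dividend = BigDecimal.ZERO;
        if (profit.signum() > 0) {
            dividend = profit.divide(betOnHorse, 2, RoundingMode.DOWN);
        }
        return BigDecimal.ONE.add(dividend).setScale(2, RoundingMode.UNNECESSARY);
    }

    public static void main(String[] args) {
        BigDecimal pool = new BigDecimal("900");
        BigDecimal take = new BigDecimal("0.15");
        // Horse 1 from the example above: prints 7.65
        System.out.println(payoutPerDollar(pool, take, new BigDecimal("100")));
    }
}
```

With the same pool and take, horse 4 ($300 bet to win) would pay $2.55 per $1: the profit is $765 - $300 = $465, and $465 / $300 = $1.55.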

Place

Calculating the prospective payouts for place bets is slightly more complicated. Two horses place in every race and the payouts are different for each pair of horses. For this reason, you can't calculate the prospective payout for horse x to place without knowing which other horse placed. But you can calculate the prospective payouts for horses x and y to both place. The process for calculating each prospective payout in that case is:

Begin with all of the money in the place pool.
Subtract the track's "take."
Subtract the total amount that was bet on horses x and y to place. The remaining amount is called the profit.
Divide half of the profit by the amount that was bet on horse x to place. Round down to the nearest cent. This is the dividend for horse x.
Divide half of the profit by the amount that was bet on horse y to place. Round down to the nearest cent. This is the dividend for horse y.
The payout for each $1 bet on horse x to place would be $1 + the dividend for horse x.
The payout for each $1 bet on horse y to place would be $1 + the dividend for horse y.
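The place steps above can be sketched the same way. The names are hypothetical, and the pool figures in main are made up for illustration rather than taken from this write-up. The sketch applies the "dividend is never negative" rule from the win section, and it assumes that both horses have money bet on them.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative only: the class and method names are hypothetical.
class PlacePayoutSketch {
    // Returns { payout per $1 on x, payout per $1 on y }.
    static BigDecimal[] payoutsPerDollar(BigDecimal placePool, BigDecimal takeRate,
                                         BigDecimal betOnX, BigDecimal betOnY) {
        // Assumption: a horse with no place bets is rejected outright.
        if (betOnX.signum() <= 0 || betOnY.signum() <= 0) {
            throw new IllegalArgumentException("no money bet on a placing horse");
        }
        BigDecimal afterTake = placePool.subtract(placePool.multiply(takeRate));
        BigDecimal profit = afterTake.subtract(betOnX).subtract(betOnY);
        return new BigDecimal[] { payout(profit, betOnX), payout(profit, betOnY) };
    }

    private static BigDecimal payout(BigDecimal profit, BigDecimal bet) {
        BigDecimal dividend = BigDecimal.ZERO;
        if (profit.signum() > 0) {
            // (profit / 2) / bet equals profit / (2 * bet); dividing once
            // means the only rounding is the final round-down to the cent.
            BigDecimal divisor = bet.multiply(BigDecimal.valueOf(2));
            dividend = profit.divide(divisor, 2, RoundingMode.DOWN);
        }
        return BigDecimal.ONE.add(dividend).setScale(2, RoundingMode.UNNECESSARY);
    }

    public static void main(String[] args) {
        // Made-up figures: $1,000 place pool, 15% take, $200 on x, $300 on y.
        BigDecimal[] p = payoutsPerDollar(new BigDecimal("1000"),
                new BigDecimal("0.15"), new BigDecimal("200"), new BigDecimal("300"));
        System.out.println(p[0] + " " + p[1]); // 1.87 1.58
    }
}
```

In the made-up case, the profit is $850 - $500 = $350, so each horse's bettors share $175. Horse x's dividend is $175 / $200 = $0.875, rounded down to $0.87, and horse y's is $175 / $300, rounded down to $0.58.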

Show

Not surprisingly, show bets are slightly more complicated than place bets, since three horses show in every race: the first-, second-, and third-place horses. The same principle that we used to divide the place pool among the two placing horses will be used to divide the show pool among the three showing horses.

In other words, to calculate the payouts for horses x, y, and z to show:

Begin with all of the money in the show pool.
Subtract the track's "take."
Subtract the total amount that was bet on horses x, y, and z to show. The remaining amount is called the profit.
Divide 1/3 of the profit by the amount that was bet on horse x to show. Round down to the nearest cent. This is the dividend for horse x.
Divide 1/3 of the profit by the amount that was bet on horse y to show. Round down to the nearest cent. This is the dividend for horse y.
Divide 1/3 of the profit by the amount that was bet on horse z to show. Round down to the nearest cent. This is the dividend for horse z.
The payout for each $1 bet on horse x to show will be $1 + the dividend for horse x.
The payout for each $1 bet on horse y to show will be $1 + the dividend for horse y.
The payout for each $1 bet on horse z to show will be $1 + the dividend for horse z.
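Since the show rules follow the same principle as the place rules, one way to sketch them is with a single method that splits the profit equally among however many horses are passed in. This generalization is a choice made for this sketch, not something the rules above require. The names are hypothetical and the figures in main are made up.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative only: the class and method names are hypothetical.
class ShowPayoutSketch {
    // Returns the payout per $1 for each finisher, in the order given.
    // Pass three bet totals for a show pool; two would mimic a place pool.
    static BigDecimal[] payoutsPerDollar(BigDecimal pool, BigDecimal takeRate,
                                         BigDecimal... betsOnFinishers) {
        BigDecimal profit = pool.subtract(pool.multiply(takeRate));
        for (BigDecimal bet : betsOnFinishers) {
            // Assumption: a finisher with no bets is rejected outright.
            if (bet.signum() <= 0) {
                throw new IllegalArgumentException("no money bet on a finisher");
            }
            profit = profit.subtract(bet);
        }
        BigDecimal shares = BigDecimal.valueOf(betsOnFinishers.length);
        BigDecimal[] payouts = new BigDecimal[betsOnFinishers.length];
        for (int i = 0; i < payouts.length; i++) {
            // The dividend is never negative; otherwise round down to the cent.
            BigDecimal dividend = BigDecimal.ZERO;
            if (profit.signum() > 0) {
                // (profit / n) / bet equals profit / (n * bet).
                BigDecimal divisor = betsOnFinishers[i].multiply(shares);
                dividend = profit.divide(divisor, 2, RoundingMode.DOWN);
            }
            payouts[i] = BigDecimal.ONE.add(dividend).setScale(2, RoundingMode.UNNECESSARY);
        }
        return payouts;
    }

    public static void main(String[] args) {
        // Made-up figures: $1,200 show pool, 15% take, $100/$200/$300 on x/y/z.
        BigDecimal[] p = payoutsPerDollar(new BigDecimal("1200"), new BigDecimal("0.15"),
                new BigDecimal("100"), new BigDecimal("200"), new BigDecimal("300"));
        System.out.println(p[0] + " " + p[1] + " " + p[2]); // 2.40 1.70 1.46
    }
}
```

In the made-up case, the profit is $1,020 - $600 = $420, so each horse's bettors share $140. That gives dividends of $1.40, $0.70, and $0.46 (the last rounded down from $0.4666...).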

The system

Your customer is entering the racetrack management business, providing services to owners of tracks around the U.S. (and, potentially, around the world). They already have agreements with several track owners to be first-phase users (and who, incidentally, are also early-stage investors in your customer's new company, so they share an interest in the success of the system). Racetrack management is a complex business, so the system will be built in a number of phases; this project focuses only on the first phase, whose central goal is to provide the tracking of bets on races and determine payouts for tickets afterward.

The basic requirements for the first phase of the system follow.

At the beginning of a race day, the day's races will be configured. As discussed above, each race has a race number and each horse in each race has a horse number. Once configured, it is safe to assume that the configuration of races will not change.
Accept win, place, and show bets on individual races.
Generate unique ticket identifiers for each bet and keep track of all outstanding tickets for all races.
Determine potential payouts for tickets based on a hypothetical race result (e.g., "How much would a place bet on horse 3 in race 6 pay, given the current place pool?").
Determine actual payouts for tickets, including disallowing the same ticket from being claimed more than once.
The "take" is controlled in the U.S. by state and local regulations. At different racetracks managed by your customer in different places, this value will need to be configured differently.

For the purposes of this project, you are not required to consider the user interface for the system — which is likely to be some combination of on-site systems for racetrack personnel and web-based systems for Internet betting — nor are you required to consider long-term storage of results (e.g., in a database). Effectively, you are designing an API for manipulating in-memory data structures that contain information about a race day. You should assume, of course, that there will be some sort of user interface, and that results will need to be stored long-term in some fashion, so this will inform the design of your API.

Anticipated future changes

While the initial system is intended only to support the functionality discussed above, the customer is known to be considering the following additional changes, among many others, in a future phase. If possible, your design should endeavor not to preclude these changes.

At a real racetrack, horses are sometimes scratched (i.e., it is decided that they will not run a race, due to an injury or worries about the condition of the track), occasionally just before post time. A future version of the system will need to handle this, presumably by allowing any ticket betting on a scratched horse to be refunded and removed from the corresponding betting pool.
The inclusion of "exotic" bets, such as exactas and trifectas, and multi-race "parlay" bets, such as doubles and pick-sixes, is something that the customer is interested in supporting in a later release.
The calculation and dissemination of live odds, which is a simple way to estimate payouts based on the amounts of money in the win pool for various horses.
The system will later need to support saving historical results into a database so that reporting can be done. (Off-the-shelf tools that can generate a variety of reports from databases are common; they would like to eventually be able to use these tools.)

Try to anticipate other reasonable changes that might be required by the customer, given your understanding of the problem domain and business context. You don't have to consider every possibility — that's simply impossible and can be counterproductive to attempt — but the general rule is that flexibility is better than the lack thereof, if you can find a way to provide it.

Part 1

Deliverables

Create an object-oriented design for the system, expressed as a UML class diagram. You should assume that the implementation language is Java, so your design should use only features that you would find in Java (e.g., no multiple inheritance) and you should use any Java feature that you feel is appropriate (e.g., interfaces, properly marked in your diagram using the <<interface>> attribute). Your UML class diagram should include the signatures of public methods, along with the types and names of any private fields and the signatures of any private methods you think you'll need in each class.

There is a wide variety of tools available for drawing UML class diagrams and, in general, we don't have a preference about which one you use. For those of you working in the ICS labs, Microsoft Visio — which is also available as a free download for students in this course via the MSDN Academic Alliance program — is available. You may also draw your diagrams legibly by hand and scan them for submission, but we do not recommend this approach, as this design has a number of moving parts and will likely require a fair amount of work and rework in order to get right.

However, we do have one requirement: that you can generate a complete, legible version of your diagram as a (possibly multi-page) PDF file, which you will need to submit electronically. Because there are so many tools available, we are not likely to have had experience with particular tools you might find online, so it will be your responsibility to verify whether you can generate a diagram in PDF format (either directly through the tool or using some other tools to transform your result) before committing yourself to using a particular tool. Submitting a UML class diagram along with instructions on what we should install in order to view it is not acceptable for the purposes of this project; it must be a PDF file. For those of you drawing your diagram by hand, most scanners will scan directly to PDF files.

How to know whether your diagram is good

When confronted with this project, some of you simply may not have a clear understanding of what you're aiming for. Your UML class diagram, in this project, is a way of communicating a fairly detailed view of your design to a potential implementer. While, in many practical scenarios, that implementer might be you, it's also quite possible for it to be someone else (or a team that includes you and others).

Your overall goal should be to communicate the details of your design — what classes are required, what responsibilities they have, and how they interact with and depend on one another — as completely as possible. Ideally, it should be possible for someone other than you, given a similar understanding of the problem, to take your UML class diagram and implement the design that it describes without having to ask you very many questions. This is why it is crucial for you to include design decisions such as fields, methods and parameters, and the various kinds of relationships (e.g., association, generalization, aggregation, and composition), along with descriptive names on those relationships and additional textual notes where appropriate for clarity.

If you flesh out your ideas like this, you'll stand a much better chance of leaving someone in a position to be able to implement your design as specified (or nearly so), so long as your underlying design is also a good one.

How to know whether your design is good

In lecture, we discussed a number of principles of good design — e.g., the DRY and YAGNI principles, separation of concerns, information hiding, high cohesion and low coupling, acyclic dependencies, classes having single responsibilities — which are useful things to keep in mind as you put your design together. If you're following these principles, your design is likely better than it would be if you aren't.

One overarching principle to bear in mind, as well, is that your design should be resilient to change. This is obviously a tough nut to crack; how can you know how the requirements will change in the future? The short answer is that you can't, but you can do a couple of things to give yourself a handle on how flexible your design is:

As you're no doubt working on your design in stages, adding support for one or a small handful of requirements at a time, are you finding yourself having to make radical adjustments to portions of the partial design you already finished in order to accommodate the new requirements? If so, this is a strong indication that you haven't been providing yourself enough flexibility.
When you think you've got a good idea for some part of your design, sketch out a portion of the UML class diagram, then consider a couple of reasonable ways in which the requirements might change. What would you have to change in your design to accommodate the new requirements you dreamed up? It's not especially important that you're "guessing right" about how requirements will really change in the future; the point of the exercise is to see what kinds of change your design will accommodate easily. For example, if adding the requirement that the racetrack's take can be configured differently on different days would require substantial redesign work, this is a symptom of inflexibility. (More to the point, it's a symptom that you haven't isolated the design decision of how you determine, at a given point in time, what the racetrack's take will be.)

Deliverables

Part 1 is due on Tuesday, January 22 at 5:00pm (i.e., the beginning of lecture). Submit a PDF file containing your UML class diagram to Checkmate before arriving at lecture.

Follow this link for a discussion of how to submit files via Checkmate. Be aware that we'll be holding you to all of the rules specified in that document, including the one that says that you're responsible for submitting the version of your files that you want graded. We won't regrade a project simply because you submitted the wrong version accidentally.

Part 2

Peer Design Review

The lecture on Tuesday, January 22 will be devoted to a peer review of your design from Part 1, where you will share your design with other students and receive feedback from other students about their perceptions of its quality.

In order to participate fully in the peer review, you will be required to come armed to participate, which means (at least) the following.

Come with an open-minded readiness for receiving constructive feedback. It's safe to say that no one in the room — me included, were I to submit my own design for feedback — would have a perfectly airtight design that is impervious to criticism.
Similarly, bring with you a willingness to express reasoned criticism of other people's work. The goal is to talk about design issues and suggest solutions, not to criticize people. Choose your wordings carefully; respectfully offered feedback will make it easier to respectfully receive it. While this kind of thing can be uncomfortable to some, it is a vital skill, not only in the software industry, but in any industry.
Bring three printed copies of your UML class diagram. Make sure your name appears on all three copies, and assume that you will not get to leave with the copies you arrived with.
Before arriving, be sure you've read through the Peer Design Review form, which will orient your thinking about what you should be looking for in your reviews. While you will not be filling this form out during the lecture time — which I'd like you to spend looking at, asking questions about, and providing feedback on each other's designs — you will need to write these up and submit them later.

As you work on your reviews during the Peer Design Review session, feel free to mark up the copies of the designs that you're reviewing, as well as making notes about the issues you find and the things you discuss. At the conclusion of the Peer Design Review session, be sure to take with you the copies of the designs that you reviewed, as you will need them in order to complete Part 2.

Documenting your findings

After your reviews are completed, retreat to a quieter locale and look again at each of the UML class diagrams that you reviewed during the Peer Design Review lecture. Having reviewed all three of them once, and given a chance to consider them in a bit more depth without the constraints of time and the distractions of having other people in the room, complete the Peer Design Review form for each of the designs.

Your goal here is to engage with the designs fully, which means that you'll want to spend a little time evaluating them carefully and being sure that you understand what's there, what's clear, and what's missing.

One important thing you'll want to decide in this process is which of the designs would be most easily implemented without requiring changes. Recognizing better and worse designs is a valuable skill in its own right, and you can be sure that a good choice here will pay off a little later this quarter.

When and what to submit

Submit your completed Peer Design Review forms to Checkmate. These are due on Thursday, January 24 at 5:00pm (i.e., at the beginning of lecture). You may submit each as a Microsoft Word (.doc or .docx) or PDF (.pdf) document. The name of each file should be the name of the student whose work you evaluated (along with the appropriate extension); so, for example, if one of the designs you were evaluating was mine and you were submitting a PDF document, you would name that file Alex Thornton.pdf.

Follow this link for a discussion of how to submit files via Checkmate.