ICS 184 / EECS 116 Summer 2006
Assignment #3

Due date and time: Friday, August 4, 11:59pm


Introduction

Thus far, we've focused our energies on the task of querying an existing database, which is a centrally important skill in the field of data management. However, when it comes to real-world database applications, being able to write queries is only a relatively small part of the battle. One of several other necessary skills is the ability to design a new database.

Database design is a complex undertaking, requiring both technical prowess and an understanding of the rules and constraints governing the underlying enterprise. This assignment will begin to cultivate your understanding of database design, by asking you to engage in a few of the tasks involved in the design of a database application, assuming that it has already been decided what tables will be needed and what columns they'll require. A future assignment will allow you to explore how to decompose the system requirements into a set of necessary tables and columns.


The application

In a single-person software development effort, it's often unnecessary to formally keep track of defects and problems; often, it's enough to write things down on sticky notes or keep a "to do" list in a text file. When a development effort involves a larger number of people — be it an in-house development project at a single company site, or a large open-source project with contributors spread throughout the world — it becomes much more important to centrally track bugs and feature requests, so that one person can report a defect or feature request, the task can be assigned to the appropriate person who will make the necessary changes, and it is evident to all involved who the responsible party is. Also of use is a discussion thread included with each defect or feature request, so that developers responsible for fixing bugs and the people who reported them can clarify any issues as the fixes are made and tested.

For this assignment, we'll consider some aspects of the design of a hypothetical Web-based defect-tracking system called Stinkbug. Stinkbug's primary goal is to track defects in software systems, where a defect is defined as something at least one user believes is "wrong" with the system.

The data requirements of the system follow.

With the requirements above in mind, the database will require the following tables. Columns that are part of the primary key for a table are underlined.

The database design is required to include whatever mechanisms will help ensure the integrity of the data, according to the requirements above; particularly because the system is not intended to handle a high volume of transactions, the cost of ensuring data integrity is well worth it, relative to the cost of having incorrect data in the database.


Starting point

I have provided an SQL script that creates (or re-creates) the Stinkbug database. In its provided form, it creates the database and its tables, without specifying any integrity constraints on the tables, and without including any of the other necessary parts of the design. You are required to use this script as a starting point for your work, which is described in the next section.


Your tasks

Part 1: Defining integrity constraints (30 points)

Augment the CREATE TABLE statements in the provided SQL script so that the integrity constraints (including primary key constraints) discussed or implied by the data requirements above will be verified by the database engine whenever possible. The changes you make should be supported by MySQL; include CHECK constraints, even though MySQL will ignore them. None of the constraints will require defining an assertion; avoid the use of assertions, since MySQL does not even parse, much less support, assertions. (In fact, I have yet to find an industrial-strength database product that does. Anyone know otherwise? I'm curious.)

You are not permitted to change the names of the tables, nor are you permitted to change the names or types of any columns, though you may find it necessary to create the tables in a different order than the one given in the provided SQL script; if so, you can rearrange the order of the table creations.

Part 2: Defining views (20 points)

Add SQL statements that add each of the following views to the Stinkbug database.

Part 3: Defining and using stored procedures and functions (30 points)

Write SQL statements that add the following stored procedures and functions to your database. You are permitted to make use of the views you defined in Part 2, if you wish, though it is not required.

In lecture, we've covered a fair amount of syntax for stored procedures and functions in MySQL, but you'll almost inevitably have questions as you work. We'll gladly answer questions, though you might be able to get a quicker answer by checking out the MySQL 5.0 documentation regarding stored procedures and functions, or other parts of the MySQL 5.0 Reference Manual, as necessary.

Part 4: Considering the effect of database design decisions (20 points)

One of the challenges of database design is making the necessary trade-offs, striking the appropriate balance between, say, clarity and efficiency, or between avoiding duplication of information in the database and allowing useful queries to be written without requiring joins. I made several of these kinds of decisions while designing the tables for this assignment, and I'd like you to consider a few of them by answering the following questions.

  1. I selected each user's email address as the primary key for the user table, since each user is only allowed to have one email address, and no two users may have the same email address. This choice will work, since email addresses uniquely identify users, but there are performance implications. If this system were used in practical circumstances, users would need to be allowed to change their email addresses. What changes must be made in the database whenever a user changes his/her email address? How could the database have been designed differently to allow an email address change to be implemented more efficiently?
  2. The view assigned_component_defects could also be implemented as a table, with the information physically stored in the database. What are the trade-offs involved with the decision to implement this as a view or as a table? What are the advantages of implementing assigned_component_defects as a table? What are the advantages of implementing it as a view?

Documentation

As you'll notice in the provided SQL script, it is possible to include comments in an SQL script by beginning a line with a '#' character. While you are not required to write a lot of documentation for this assignment, I would like you, at the least, to point out which portions of the script address which of the tasks you've been assigned. This will make it easier for us to grade your work accurately. It's especially important to include your answers to the questions posed in Part 4 as comments in your SQL script, so that your entire solution to this assignment will be contained in one file for ease of grading.


Some MySQL odds and ends

Using the DATETIME type

The DATETIME data type in MySQL is similar to the DATE type that we used in the previous assignment. The key difference is that DATETIME stores both a date and a time, while DATE stores only a date. MySQL accepts a number of formats for DATETIME values, but I suggest using the 'YYYY-DD-MM HH:MM:SS' format. For example, to insert a new defect into the database reported at February 20, 2006 at 3:24:55pm, you might execute this INSERT statement:

    INSERT INTO defect VALUES(101, 'Stinkbug needs integrity constraints',
        5, 3, 'alex@thornton.com', '2006-02-20 15:24:55')

Using the TEXT type

The TEXT data type in MySQL represents a string of text that could potentially be quite long. Syntactically, the TEXT type behaves no differently from the other character-string types, such as CHAR and VARCHAR; TEXT just has a longer length limit, and is treated differently internally. The exact length limit depends on the platform and MySQL version, but suffice it to say, for our purposes, that TEXT columns will be plenty long enough whenever we've used them.

The need for the DELIMITER statement when declaring a stored procedure or function using the mysql command-line utility

When using the mysql command-line utility, semicolons are a way to indicate that you've supplied all of the input you plan to supply for one SQL statement. (This is the only way for mysql's parser to know when a statement has ended.) Unfortunately, the syntax for stored procedures and functions also uses semicolons, as a way of terminating statements within a stored procedure/function. So, when you create a procedure or function within the mysql utility using the CREATE PROCEDURE or CREATE FUNCTION statements, semicolons within the procedure/function will fool mysql into thinking that you've finished the entire CREATE PROCEDURE command, even though you haven't.

The solution to this problem is to temporarily tell mysql to use something other than a semicolon as a delimiter, the indicator that specifies when a complete SQL statement has been entered. Then, use the new delimiter to terminate the CREATE PROCEDURE or CREATE FUNCTION statement, and, finally, change the delimiter back to semicolon. An example follows.

    DELIMITER //
    
    CREATE FUNCTION always_zero()
    BEGIN
        RETURN 0;
    END;
    //
    
    DELIMITER ;

You'll need to do this in your SQL script, because SQL scripts are executed as though they had been typed into the mysql command-line utility.


Deliverables

Submit a single SQL script that (re-)creates your Stinkbug database, built using the provided SQL script as a starting point. At the bottom of this script, include (as comments) your answers to the questions asked in Part 4 of the assignment.

Please do not include statements that populate your database with test data. We'll supply our own test data if we need it.

Follow this link for a discussion of how to submit your assignment. Be aware that I'll be holding you to all of the rules specified in that document, including the one that says that you're responsible for submitting the version of the assignment that you want graded. We won't regrade your assignment simply because you submitted the wrong version by accident.