Providing Selected Data to Film Libraries;
Upgrading FabFlixs

Phase 5


Overview

As part of its community service efforts, FabFlixs provides movie cataloging information to film libraries. FabFlixs has agreed to provide the libraries an XML file of FabFlixs' movie information; the interchange format has been codified in a DTD. You are to produce an XML file of the movie information in the FabFlixs database that meets the agreed-upon requirements as reflected in the DTD, and check that it does so.


References and Tools


What to Turn In

Turn in one ZIP or WAR file, labeled with your team name, containing your employee access (back-end) system updated to produce the XML interchange file, along with any and all data, programs, output and reports used to test that the feature works properly. Include a ReadMe file explaining how to compile, load, and otherwise prepare your system for use, and how to run it. Make sure your archive's extension is ZIP or WAR, as appropriate.

Use these conventions for the database you submit:

If you are turning in a WAR file, be sure to include all source code files in your WAR, in the directory sources within the WEB-INF subdirectory; place the ReadMe file there as well. The root directory of the WAR hierarchy should be your team's name.


Preparing the Movie Information XML File

Add to the back-end program an option to produce and check the movie XML file. When invoked, the program does just that: it creates an XML file movieinfo.xml that contains the movie, star and genre information from FabFlixs, formatted according to moveinfo.dtd, the DTD that was agreed upon by FabFlixs and the libraries. It also validates that the format of the file, and the information in it, are correct, and prepares a report of its findings in HTML that can be read by a modern Web browser. We stress that this should be a "one button" function: it creates the XML file, checks it, produces the report, and returns.

In addition to the requirements given in the FabFlixs DTD, if a movie has no stars associated with it, the star_ids attribute list consists of one item, the value "-1".
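The "-1" rule can be isolated in one small helper so that it is applied the same way for every movie. The following is a minimal sketch, not part of the assignment materials: the class and method names are hypothetical, and it assumes the items in the star_ids list are separated by single spaces, which you should confirm against the DTD.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds the value of the star_ids attribute for one movie.
final class StarIdsAttribute {
    private StarIdsAttribute() {}

    // Assumption: list items are separated by single spaces; check the DTD
    // for the separator the libraries actually expect.
    static String format(List<Integer> starIds) {
        if (starIds == null || starIds.isEmpty()) {
            return "-1"; // required placeholder when a movie has no stars
        }
        return starIds.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(format(List.of()));          // movie with no stars
        System.out.println(format(List.of(12, 7, 31))); // movie with three stars
    }
}
```

When you later read the file back, remember to treat a lone "-1" as "no stars" rather than as a star ID to look up.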

Validating the format

A straightforward way to validate the XML file's form is to parse the movieinfo XML file using SAX, StAX or DOM. Their Java libraries provide functions that validate XML files and that parse the file and make its contents available to your program, which your program will then use to check the correctness of the movie information. We focus on SAX here, as it is the simpler approach for this task; you may use one of the other approaches if you wish.

To familiarize yourself with SAX, check out the (very good) SAX tutorial referenced above. We also recommend you run and study XMLtoScreen.java, a Java program that uses SAX to validate and print to the screen the contents of an XML file; a sample XML file and its associated DTD are also provided. Details of the SAX classes are available in the Java API documentation; a link to it is in References and Tools, above.
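For orientation, here is a minimal sketch of a validating SAX parse that also captures content as it goes. It is not XMLtoScreen.java, and the element name "title" is an assumption made only for illustration; substitute the names that the DTD actually defines.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class ValidatingSaxSketch {

    // Collects what the parser hands us: titles seen and validity errors reported.
    static final class Handler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        int validityErrors = 0;
        private StringBuilder text; // non-null only while inside a <title> (assumed name)

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("title".equals(qName)) text = new StringBuilder();
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one element, so append.
            if (text != null) text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName)) {
                titles.add(text.toString());
                text = null;
            }
        }

        @Override
        public void error(SAXParseException e) {
            validityErrors++; // DTD violations are reported here, not thrown
        }
    }

    static Handler parse(InputSource source)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        SAXParser parser = factory.newSAXParser();
        Handler handler = new Handler();
        parser.parse(source, handler);
        return handler;
    }
}
```

To parse a file on disk, one option is to pass new InputSource(new java.io.File("movieinfo.xml").toURI().toString()) so that a relative DTD reference in the DOCTYPE declaration is resolved against the file's location.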

Instead of using SAX, StAX or DOM to verify the movieinfo file, you could write a script to call an existing Web-based verifier from within your program, such as one of those referenced above, pass it the XML and DTD files, and have the back-end program capture the resulting output. The Web validators are also a good way to quickly check your XML and DTD files during development; in particular, if those verifiers give results that are different from your back-end system's verifier, some further investigation is definitely in order!

There's no real point in testing the correctness of movieinfo's information if the file is not in a valid format: the libraries cannot deal with a malformed file. Typically, once an error is found, processing ceases (though not always; sometimes you have an option. See the SAX tutorial for details on handling fatal and non-fatal errors). If a fatal validation error occurs, report it to the user and cease processing the file. If a non-fatal error occurs, continue validating the file, reporting all messages to the user (until a fatal error occurs).
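One way to get this behavior from SAX is to override the three error callbacks: record warnings and errors and return, so parsing continues, and rethrow on a fatal error, so parsing stops. The sketch below assumes that approach; the class name ValidationReport and the message format are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical collector: keeps every message for the report, stops on a fatal error.
final class ValidationReport extends DefaultHandler {
    final List<String> messages = new ArrayList<>();
    boolean fatal = false;

    private void record(String level, SAXParseException e) {
        messages.add(level + " line " + e.getLineNumber() + ": " + e.getMessage());
    }

    @Override
    public void warning(SAXParseException e) { record("WARNING", e); }

    @Override
    public void error(SAXParseException e) { record("ERROR", e); } // non-fatal: keep going

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        record("FATAL", e);
        fatal = true;
        throw e; // cease processing the file
    }

    static ValidationReport validate(String xml) {
        ValidationReport report = new ValidationReport();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), report);
        } catch (SAXException e) {
            if (!report.fatal) { // thrown by something other than fatalError
                report.fatal = true;
                report.messages.add("FATAL: " + e.getMessage());
            }
        } catch (IOException | ParserConfigurationException e) {
            report.fatal = true;
            report.messages.add("FATAL: " + e.getMessage());
        }
        return report;
    }
}
```

If report.fatal is true after validate returns, skip the information checks and show the collected messages to the user.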


Verifying the Information

As SAX, StAX and DOM validate your file, they also make the file's information available. (If you instead used a separate validator, you will have to read the files again, using either SAX, StAX or DOM to obtain the data.) Place this extracted movie information into a duplicate set of FabFlixs tables that start off empty; we'll call this second database CheckMovies. If the file has no validation errors (and you made no other mistakes in your code), then CheckMovies will contain all the data from movieinfo.xml. By employing appropriate SQL, you can now compare the data in the two databases to be sure the tables in CheckMovies contain the same movie information as FabFlixs.

You obviously want your testing to be as thorough as feasible given the time constraints in which development must be completed. At minimum, your tests should include the checks listed below. Some checks overlap; for instance, if all the IDs in the movie table in CheckMovies "match up" with their counterparts in FabFlixs, then the number of records in each table is the same. But it makes sense to get a count of records first, since running this quick test will tell you if something is off a lot faster than going through a list of IDs looking for ones that don't match. How you group these tests, and in what order you do them, is up to you. Report any problems you find. Make sure the report is readable and informative: just throwing a bunch of data at a user is not informative. The report must be organized and presented so that potential problems can be quickly seen, and the information presented must be sufficient to allow quick access to the problematic entries.

You are encouraged to add other tests to this suite. Your goal is to convince management beyond any doubt that movieinfo.xml is complete and correct.
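The "count first, then match IDs" idea can be sketched as a small per-table check that also emits one row of the HTML report. This is an illustration only: the class name TableCheck is hypothetical, and it assumes you have already read each table's IDs into a collection (for example, with a SELECT of the ID column run against each database).

```java
import java.util.Collection;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical per-table check: compares record counts, then the ID sets.
final class TableCheck {
    final String table;
    final int fabflixsCount;
    final int checkCount;
    final SortedSet<Integer> missingFromCheck = new TreeSet<>(); // in FabFlixs only
    final SortedSet<Integer> extraInCheck = new TreeSet<>();     // in CheckMovies only

    TableCheck(String table, Collection<Integer> fabflixsIds, Collection<Integer> checkIds) {
        this.table = table;
        this.fabflixsCount = fabflixsIds.size();
        this.checkCount = checkIds.size();
        missingFromCheck.addAll(fabflixsIds);
        missingFromCheck.removeAll(checkIds);
        extraInCheck.addAll(checkIds);
        extraInCheck.removeAll(fabflixsIds);
    }

    boolean passed() {
        return fabflixsCount == checkCount
                && missingFromCheck.isEmpty()
                && extraInCheck.isEmpty();
    }

    // One row of an HTML table: the counts, the offending IDs, and a verdict,
    // so a reader can go straight to the problematic entries.
    String toHtmlRow() {
        return "<tr><td>" + table + "</td><td>" + fabflixsCount + "</td><td>" + checkCount
                + "</td><td>" + missingFromCheck + "</td><td>" + extraInCheck
                + "</td><td>" + (passed() ? "OK" : "PROBLEM") + "</td></tr>";
    }
}
```

Matching IDs is only one of the checks; the same pattern extends to comparing the other columns and the movie-to-star and movie-to-genre associations.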

If you write the code that extracts FabFlixs and CheckMovies data such that the output from each is in exactly the same format, you can easily compare these (often quite long) files using a file comparison utility; such a utility will highlight any differences found between the two files, thus quickly pointing up issues to be investigated. Many such utilities exist. In particular, Windows has one built in called fc. It is run from the command prompt window; for example,

C:\> fc file1 file2
will compare file file1 with file2. Your machine might also have one called WinDiff.exe (depending upon the options used when Windows or other Microsoft products were installed). You can learn more about fc and WinDiff by entering their names into the Search box of the Windows Help and Support feature.

fc's (and even WinDiff's) output is pretty primitive. For more readable and more informative comparisons, use the file comparison feature found in a text editor; for instance, TextPad has such a feature. You can also try a commercial or shareware tool. I've used a previous version of Araxis' Merge and liked the way it presented results. You can get a 30 day free trial of Merge by signing up at Araxis' web site; the link is given above. (Standard disclaimer: I am in no way affiliated with Araxis.)
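For a file comparison to be meaningful, both extracts must be written deterministically: same field order, same separator, same treatment of missing values, and the same record order. The following sketch shows one way to do that; the class name CanonicalDump, the tab separator and the \N marker for missing values are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical dump helper: one line per record, in a fixed, sorted format.
final class CanonicalDump {
    private CanonicalDump() {}

    static List<String> toLines(List<String[]> rows) {
        List<String> lines = new ArrayList<>();
        for (String[] row : rows) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < row.length; i++) {
                if (i > 0) sb.append('\t');
                // Missing values get an explicit marker; stray padding is trimmed.
                sb.append(row[i] == null ? "\\N" : row[i].trim());
            }
            lines.add(sb.toString());
        }
        Collections.sort(lines); // record order must not depend on the source
        return lines;
    }

    static void write(Path file, List<String[]> rows) throws IOException {
        Files.write(file, toLines(rows), StandardCharsets.UTF_8);
    }
}
```

Run the same helper over the rows read from FabFlixs and from CheckMovies, write each result to its own file, and hand the two files to fc or another comparison tool.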

Major hint: First do your testing using a FabFlixs database with just enough movie-related records in it to get a clear idea whether your create-and-check-XML-file function is working properly. (There's no need to have any customer-related records in this test database, since they are not going to be part of the XML file.) Then add several more records that are designed to increase complexity; for instance, have a couple of movies that have many stars and genres and that share some of them; test the function again. When you are convinced that the function is working correctly, run a final set of tests on a big, complex database (for instance, containing the test data we provided at the start of the project). This approach will take much less time, and likely result in more thorough testing, than running all your tests solely on a large database.


Demonstration

During the demonstration, you are to run your XML-creation function, using a database of sufficient size to reflect a degree of complexity that approaches what would be found in the "real" FabFlixs database, but not one so large that it takes more than several seconds to produce the XML file. You then run your test suites and show the reviewer your results. Your goal is to convince the reviewer beyond any doubt that the movieinfo.xml file is complete and correct.

Written for ICS185 Spring 2005 by Norman Jacobson, March 2005.
   Some sections adapted from an ICS185 Winter 2005 exercise written by Chen Li.
Rewritten to include the enhancements section and other new requirements, to improve clarity, and to remove the XQuery portion previously present, for CS122B, Spring 2007, by Norman Jacobson, May 2007.
Director modifications removed, as they are now part of Phase 1; minor typos fixed by Norman Jacobson, September 2007.
Genre selection and login enhancements removed, as they are now incorporated into Phase 2, by Norman Jacobson, December 2007.
Minor revisions for clarity and some typos corrected, by Norman Jacobson, September 2008.
Revised to include README file requirement, by Norman Jacobson, December 2008.