Project Guidelines

CS 277: Data Mining, Spring 2013

The goal of your project is to give you hands-on experience in applying data mining techniques to one or more large data sets, going through the following steps:

Here are the project deliverables and due dates:

Project Deliverable                         Due Date
1. Project Proposal                         In class, Thursday April 11th
2. Project Progress Update Presentations    In class, Tues Apr 30 / Thurs May 2
3. Final Project Presentations              In class, Tu/Th May 28/30 and Tu/Th Jun 4/6
4. Project Report                           EEE dropbox, 12 noon, Thursday June 6 *AND* paper copy in class

Data Sets:
Find a data set for your project that has some potential for data mining. The data set should be reasonably large and come from a domain that you know something about or are interested in learning about.

Here is a long list of potential datasets.

Note that you are not allowed to work on a data set you have worked with (or are working with) outside this class (e.g., for a different class project, for research, etc.). So, for example, if you worked on the Netflix prize data before in any significant way (e.g., for a class project), you need to pick a different data set for your CS 277 project.

Types of Projects:
Here are some typical types of projects:

Software:
You can use existing software (e.g., MATLAB, R, or other packages or standalone software), and you can also write your own software in any language or environment you wish. You must clearly list and cite any code that you use but did not write, including any code that you modify.

Teaming:
Students must work in project teams of 2 or 3. I will expect to see the work of 2 or 3 people, so you will need to think clearly about who will do what on the project and how to make sure that it has enough work to justify having 2 or 3 people work on it. Your progress report and final report will each be a single report, consisting of one common section for the project followed by a separate section describing the work of each individual team member. Grades will be assigned individually (2 people on the same project might get different grades depending on their contributions), but they will clearly be correlated with the overall quality of the project. So you should think carefully (and choose partners carefully!) before signing up for a team project.

Progress Report and Final Report:
More details will be provided closer to the dates that these are due. Your reports should be clearly written and comprehensive. Your final project report should be at the level of a conference or workshop paper. The following outline from Professor Ray Mooney on a possible project report format may be helpful.