CS 175: Project in Artificial Intelligence

Spring Quarter, 2010

Logistics:

Instructor: Arthur Asuncion (asuncion@uci), office hours: Friday 1:30pm-2:00pm and 3:30pm-4:00pm, DBH 4059

Teaching Assistant: Yutian Chen (yutian.chen@uci), office hours: Tuesday 2:00-3:30pm, DBH 4059

Location and Times: ICS 180, T/Th 12:30pm-1:50pm

Required Software: MATLAB (information; tutorials)

Textbook: None required. However, we recommend the following optional textbooks for reference:
  • Principles of Data Mining. MIT Press, 2001, D.J. Hand, H. Mannila, P. Smyth.
  • Data Mining. Morgan Kaufmann, 2000. I.H. Witten, E. Frank.
  • Pattern Recognition and Machine Learning. Springer, 2006. C.M. Bishop.
  • Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003. D. MacKay. (free)

Course Goals:
  • To develop a practical AI system (we'll focus on a subarea of AI known as machine learning).
  • To learn how to process and analyze real-world data ("data mining").
  • To gain exposure to machine learning algorithms and evaluation techniques.
  • To learn how to use scientific software (MATLAB).
  • To learn how to effectively collaborate with others on a team project.
  • To learn how to wade through technical papers to find relevant information.

Schedule:

We plan to lecture on topics in data mining and machine learning that are highly relevant to the course projects. Some class periods will be devoted entirely to discussing project progress.

3/30: Introduction to data mining, exploratory data analysis [Slides]
  Read: Review of Basic Concepts in Probability
4/1: Measurements, Classification [Slides]
  See: Matlab mini-tutorial
4/6: Classification, Part 2 (see 4/1 slides)
4/8: Logistic Regression (see 4/1 slides)
  Read: Notes on Supervised Learning
4/13: Regression [Slides]
4/15: Regression, Part 2 (see 4/13 slides), Project Discussion [Slides]
  Read: Notes on Regression Trees
4/20: Project Proposal Presentations
  [Matlab Discussion Section Slides]
4/22: Collaborative Filtering [Slides]
4/27: Collaborative Filtering, Part 2 (see 4/22 slides)
4/29: Clustering [Slides]
5/4: SVMs [Yutian's slides and scripts]
5/6: Project Progress Report #1
5/11: Project Discussion
5/13: Decision Trees [Slides]
5/18: Project Discussion
5/20: Project Discussion
5/25: Project Progress Report #2
5/27: Project Progress Report #2
6/1: Project Discussion
6/3: Project Discussion
6/10 (Final: 10:30am-12:30pm): Project Presentations

Assignments:

In addition to the course project, there will be several assignments to make sure that everyone is up to speed with the machine learning algorithms and MATLAB coding. These assignments are to be done individually.

Project:

Students will develop machine learning algorithms and evaluate them on real-world data sets. Teams of up to 3 students are allowed for each project. We will roughly follow the project guidelines of CS277, the graduate course in data mining. Specifically, the project will entail the following components:

The link below contains a list of real-world projects that students are encouraged to select. Students may also propose their own project as long as (1) it is related to data mining or machine learning, (2) it has real-world applicability, and (3) it is sufficiently challenging. Please do not propose projects based on data or problems you have worked on in other courses or contexts.

View Potential Project Ideas

Resources:

Here are some additional resources that may be helpful:

Grading:

The grading breakdown will be approximately 70% for the course project and 30% for other assignments. Projects will be graded based on the quality of the developed system, amount of technical progress made, thoroughness of evaluation on real-world data, and quality of technical presentation. If everyone does a great job with these projects, I have no problem giving everyone an A.

Grades will be assigned individually. Students on the same team might get different project grades depending on their contributions, although the grades will clearly be correlated with the overall quality of the project. You should therefore think carefully (and choose partners carefully) before signing up for a team project.

It is the responsibility of each student to be familiar with the UCI Senate Academic Honesty Policies. For homework assignments and projects, you are allowed (and encouraged) to discuss ideas and concepts verbally with other class members, but you are not allowed to look at or copy anyone else's written solutions or code relating to homework assignments or projects. Of course, if you are part of a team project, you may freely share project materials within the team only. All material submitted must be material you have personally written during this quarter. If you use third-party code in your projects, you must explicitly provide proper attribution to the original developers. If you implement an idea proposed by another researcher, you should provide proper citations in your project report. Failure to adhere to this policy can result in a student receiving a failing grade in the class.

Note: We may make slight changes to the syllabus as the quarter progresses. Portions of this course have been adapted from other courses, and we thank Prof. Smyth, Prof. Welling, and Prof. Ihler for making their slides available for reuse.