Machine Learning Winter 2011

Classnotes (last updated xxxx)

Q&A Blog on Machine Learning (last updated xxxx)

Note: the topics will probably change slightly.


Exams:

There will NOT be an exam on March 16 in week 11. Instead, there will be a comprehensive quiz in class on Friday, March 11. It will be similar to 2 questions from this exam.
There will be brief project presentations in class on March 2, 4, 7, and 9. Please sign up.
Your project report is due Thursday, March 10, at midnight in an EEE dropbox. Please keep it brief (say, up to 5 pages). Describe what you did. Make very sure you explain your own contribution and how it differs from what the others in your group have done. Also make sure you are precise about all input you received, literature references, etc. Evidence that you submitted to the leaderboard of some competition is highly desirable. A scientific analysis is also desirable, e.g. a report on how the various methods you tried compare and what the weaknesses and strengths of each method are.
Your HW, quizzes, comprehensive quiz, and project make up your final grade.


CompSci: 273A
Instructor: Max Welling

Office hours: Thursday 4-5pm 


Prerequisites

ICS 271 Intro AI, or consent of the instructor.


Goals:

The goal of this class is to familiarize you with various state-of-the-art machine learning techniques for classification, regression, clustering, and dimensionality reduction. Besides this, an important aspect of this class is to provide a modern statistical view of machine learning.


Grading & Homework:

The course will primarily be lecture-based.

Every week there will be homework, mostly consisting of a coding assignment. Coding will be done in MATLAB. Finishing all your homework assignments is required for passing this class. If you do not hand in your HW before the weekly deadline, you will accumulate penalty points (or, viewed more positively, if you hand in your HW on time every week you will accumulate bonus points). I may ask students to demo their implementation of the HW assignment in class.

There will be short quizzes every week, starting in week 2, but no midterm. The decision on a final will be taken later.
Quizzes will be multiple choice, and you will need to buy green Scantron test forms in the UCI bookstore (both varieties of the large green form will work).

There will also be a project, starting right from day 1. Details are given below. You may work in groups of up to 5 students, but every member must do their own coding work. Each group will also be required to give a presentation and write a brief report (as a group). It is important that the individual contributions of the group members are made transparent.

Your final grade will be determined as a combination of homework, the project, quizzes, and perhaps exams.


Projects:

Pick a project from http://www.kaggle.com/ and enter the competition.
You are required to submit your code through a dropbox in EEE and write a brief report with your group.
You must show evidence that you participated in the Kaggle competition (hopefully by winning the cash award :-) ).
You may work in groups no larger than 5 people.
More details will be explained in class.
Another competition: http://www.causality.inf.ethz.ch/unsupervised-learning.php#cont
You may also define your own project.
This project may not simply be whatever you were doing anyway for your PhD research.
Please submit a proposal for my review.


Slides:

These slides will change.

-Introduction, kNN, Logistic Regression, Cross-validation, Overfitting [ppt] [pdf]
-Decision Theory, Loss functions, ROC curves [ppt] [pdf]
-Regression, Bias-Variance Tradeoff [ppt] [pdf]
-PCA & Kernel PCA [ppt] [pdf]
-Ridge Regression & Kernel Ridge Regression (classnotes) (see the sketch after this list)
-Clustering [ppt] [pdf]
-Decision Trees [ppt] [pdf]
-Spectral Clustering (classnotes)
-Ensemble Methods (classnotes)
-Boosting [ppt] [pdf]
-Fisher Discriminant Analysis (classnotes)
-Canonical Correlation Analysis & Maximum Autocorrelation Factor Analysis (classnotes)
-Convex Optimization (classnotes)
-SVMs [ppt] [pdf]
-Classifier Evaluation [ppt] [pdf] (probably just for your own interest)
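
A standard result covered under ridge regression is the closed-form solution w = (X'X + lambda*I)^(-1) X'y. Below is a minimal MATLAB sketch of that formula on synthetic data; the toy weights, noise level, and lambda are illustrative choices, not values from the classnotes.

% Ridge regression on toy data: w = (X'X + lambda*I) \ (X'y)
n = 50; d = 3;
X = randn(n, d);                         % toy design matrix
y = X * [1; -2; 0.5] + 0.1*randn(n, 1);  % toy targets with Gaussian noise
lambda = 0.1;                            % regularization strength (illustrative)
w = (X'*X + lambda*eye(d)) \ (X'*y);     % solve the regularized normal equations
disp(w')                                 % should be close to [1 -2 0.5]

The backslash solve avoids forming an explicit matrix inverse, which is the numerically preferred way to evaluate this formula in MATLAB.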



Data Sets and Online Resources:

Homework Dataset: Iris Flower
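
To sanity-check your MATLAB setup before the first kNN homework, the sketch below classifies the Iris data with a hand-rolled k-nearest-neighbor rule. It assumes the copy of the dataset that ships with the Statistics Toolbox (load fisheriris); if your homework data comes as a separate file, load that file instead. The 100/50 split and k = 5 are illustrative choices, not homework specifications.

% k-nearest-neighbor classification of the Iris data (minimal sketch)
load fisheriris                          % meas (150x4 features), species (150x1 labels)
X = meas;
[~, ~, y] = unique(species);             % map string labels to integers 1..3

n = size(X, 1);
perm = randperm(n);                      % random train/test split
tr = perm(1:100); te = perm(101:end);

k = 5;                                   % number of neighbors
yhat = zeros(numel(te), 1);
for i = 1:numel(te)
    % squared Euclidean distance from test point i to every training point
    d = sum((X(tr,:) - repmat(X(te(i),:), numel(tr), 1)).^2, 2);
    [~, idx] = sort(d);
    yhat(i) = mode(y(tr(idx(1:k))));     % majority vote among the k nearest
end
fprintf('test accuracy: %.2f\n', mean(yhat == y(te)));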


Homework: (always due the next Tuesday at 12pm)

These homework exercises will change.

Week 1: Homework 1 [doc] [pdf]
Week 2: Homework 2 [doc] [pdf]
Week 3: Homework 3 [doc] [pdf]
Week 4: Homework 4 [doc] [pdf]
Week 5: Homework 5 [doc] [pdf]
Week 6: Homework 6 [doc] [pdf]
Week 7: Homework 7 [doc] [pdf]
Week 8: Homework 8 [doc] [pdf]
Week 9: Homework 9 [doc] [pdf]
Week 10: Homework 10 [doc] [pdf] (probably just for your own interest)


Textbook

I am writing my own text, which will be posted here and at the top of this webpage. Note that I will continuously update it during the course; feedback is appreciated.


The textbook that will be used for this course is:

1. C.M. Bishop: Pattern Recognition and Machine Learning


Optional side readings are:

2. D. MacKay: Information Theory, Inference and Learning Algorithms
3. R.O. Duda, P.E. Hart, D. Stork: Pattern Classification
4. C.M. Bishop: Neural Networks for Pattern Recognition
5. T. Hastie, R. Tibshirani, J.H. Friedman: The Elements of Statistical Learning
6. B.D. Ripley: Pattern Recognition and Neural Networks
7. T. Mitchell: Machine Learning. (http://www.cs.cmu.edu/~tom/mlbook.html)