Course: Kernel-Based Learning

ICS: 273B
Instructor: Max Welling
Time: Tu-Th 3:30-4:50 pm
Place: CS 174

I&C Sci 273B   KERNEL-BASED LEARN

Code    Typ   Sec   Unt   Instructor     Time               Place
36765   Lec   A     4     WELLING, M.    TuTh 3:30-4:50p    CS 174

Office hours: Fridays 12 pm-1 pm.


Prerequisites

ICS 273 (Machine Learning), or consent of the instructor.


Goals:
The main goal of this course is to introduce students to one of the most influential
developments in modern machine learning, namely kernel methods. The course will
focus on familiarizing students with a number of practical kernel-based algorithms
(such as “support vector machines”, “kernel Fisher discriminant analysis”,
“kernel principal components analysis” and “Gaussian processes”) and a number of
techniques for constructing kernels (such as ANOVA kernels, string kernels, graph kernels,
diffusion kernels, and set kernels). The necessary learning-theoretic preliminaries will be
treated as well, but they will not be the focus of this course. Applications to real-world
problems will serve as examples.

Students who have successfully finished this course should be ready to apply kernel techniques
to their respective research areas.


Homework:
(When asked to read chapters from the book, you are, strictly speaking, only required to
understand the mathematical details presented in class. Please do take the time to read
all the details, so that you get a feeling for the kind of material that we did not cover.)

Week 1 (11/01/05) Introduction class-slides
Read and study chapter 1 in “Kernel Methods for Pattern Analysis”.

Week 1 (13/01/05) RKHS class-slides
Read and study chapter 2 (skip sections 2.2.1 & 2.2.2);
               chapter 3, up to (but not including) page 64 / Theorem 3.13.

Week 2 (18/01/05) Convex Optimization
Read and study the lecture notes up to where we ended in class.

Week 2 (20/01/05) Convex Optimization & SVMs
Read and study the lecture notes on convex optimization up to where we ended in class;
               the lecture notes on SVMs up to where we ended in class;
               the chapter in the book on SVMs (Ch. 7, pp. 211-230).

Week 3 (25/01/05) SVMs
Read and study the lecture notes on SVMs up to where we ended in class;
               the chapter in the book on SVMs (Ch. 7, pp. 211-230).

Week 3 (27/01/05) SVMs and Ridge Regression
Read and study the lecture notes on Ridge Regression up to where we ended in class;
               the chapters in the book on SVMs (Ch. 2, pp. 31-32; Ch. 7, pp. 230-234).

Week 4 (01/02/05) Ridge Regression
Read and study the lecture notes on Ridge Regression up to where we ended in class;
               the chapters in the book on SVMs (Ch. 2, pp. 31-32; Ch. 7, pp. 230-234).

Week 4 (03/02/05) Support Vector Regression
Read and study the lecture notes on SV-Regression up to where we ended in class;
               the chapter in the book on SVR (Ch. 7, pp. 234-240).

Week 5 (08/02/05) Midterm

Week 5 (10/02/05) Fisher Linear Discriminant Analysis
Read and study the lecture notes on FLDA up to where we ended in class;
               the chapter in the book on FLDA (Ch. 5, sec. 5.4, pp. 132-139).

Week 6 (15/02/05) Fisher Linear Discriminant Analysis
Read and study the lecture notes on FLDA up to where we ended in class;
               the chapter in the book on FLDA (Ch. 5, sec. 5.4, pp. 132-139).

Week 6 (17/02/05) Novelty/Outlier Detection
Read the relevant sections of this paper on novelty detection;
               the chapter in the book (Ch. 7, pp. 195-211).

Week 7 (22/02/05) Gaussian Process Regression
               Required reading: pages 25-30 in this thesis by Joaquin Quinonero Candela.
               Background reading: this tutorial by David MacKay.

Week 7 (24/02/05) Gaussian Process Regression & Kernel Centering
               Required reading: pages 25-30 in this thesis by Joaquin Quinonero Candela.
               Background reading: this tutorial by David MacKay.

Week 8 (01/03/05) Kernel PCA
               Required reading: chapter 6, pages 140-161 (up to sec. 6.4);
                                 lecture notes.

Week 8 (03/03/05) Kernel K-means & Relevance Vector Machines
               - Joshua has volunteered to give a 15-20 min. presentation
                 on the Relevance Vector Machine
                 (this will not be required material for the final exam).
               - Kernel K-means. Required reading: sec. 8.2.2, pages 273-274.
                 Do not read the material in section 8.2 before section 8.2.2:
                 there are serious mistakes in the book.

Week 9 (08/03/05) Spectral Clustering
               - Required reading: pages 277-280 on spectral clustering;
                 Proposition 6.12, pages 148-150.

Week 9 (10/03/05) Kernel Design
               - Required reading: chapter 9, up to (and including) section 9.4;
                                   chapter 11, up to (and including) page 362;
                                   chapter 12 (all).
               You only need to know these chapters at the level of detail treated
               in class (i.e., the details of the implementation are not required material).

Week 10 (14/03/05) Demos by Students

Leftover class-note: Kernel Canonical Correlation Analysis

Matrix Cookbook

This manuscript may help you with some elementary matrix algebra.


MATLAB Demos: 
Regularization path for SVM (Trevor Hastie – Stanford)
http://svm.dcs.rhbnc.ac.uk/pagesnew/GPat.shtml (cool demo for SVMs)


Syllabus:

The course will primarily be lecture-based with homework, a project and an exam.
Some homework is likely to include a lab component, i.e., simulation, exploration,
and visualization of kernel methods using a software tool such as Mathematica or MATLAB.

Some topics that we expect to cover include:

1. Kernel Theory:
- Pattern Analysis
- Feature spaces & embeddings
- Kernel trick
- Gram matrices (see the sketch after this list)
- Characterization of kernels – Mercer’s theorem
- Over-fitting & bounds on generalization
- Large Margin classification
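
As a first taste of the kernel trick, the MATLAB sketch below builds the Gram matrix for a
Gaussian (RBF) kernel and checks that it is positive semi-definite. The toy data, the
bandwidth value, and all variable names are illustrative choices, not course material:

% Minimal sketch: a kernel algorithm only ever touches the Gram matrix
% K(i,j) = k(x_i, x_j); the feature space is never constructed explicitly.
X = randn(20, 2);                  % 20 toy points in 2 dimensions (assumed data)
n = size(X, 1);
sigma = 1.0;                       % RBF bandwidth (a free parameter)
sq = sum(X.^2, 2);                 % squared norms of the rows of X
D2 = repmat(sq, 1, n) + repmat(sq', n, 1) - 2 * (X * X');  % pairwise squared distances
K = exp(-D2 / (2 * sigma^2));      % Gaussian (RBF) Gram matrix, n x n
% Mercer's condition: a valid kernel must give a positive semi-definite Gram matrix.
min(eig((K + K') / 2))             % should be >= 0, up to rounding error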

2. Algorithms:
- Introduction to convex optimization
- Outlier detection
- Support vector machines
- Kernel Fisher Discriminant Analysis
- Kernel principal components analysis
- Kernel canonical correlation analysis
- Kernel ridge regression (see the sketch after this list)
- Support vector regression
- Gaussian Processes
- k-means clustering & Spectral clustering
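
To illustrate how these algorithms work with nothing but the Gram matrix, here is a minimal
kernel ridge regression sketch in MATLAB. The 1-d toy data, RBF kernel, and hand-picked
regularization constant are assumptions made for illustration, not an excerpt from the
lecture notes:

% Kernel ridge regression in its dual form: solve (K + lambda*I) alpha = y,
% then predict with f(x*) = sum_i alpha_i k(x_i, x*).
n = 50;
x = linspace(-3, 3, n)';                % training inputs (toy data)
y = sin(2 * x) + 0.1 * randn(n, 1);     % noisy targets
sigma = 0.5; lambda = 0.1;              % kernel width and ridge penalty (assumed values)
K = exp(-(repmat(x, 1, n) - repmat(x', n, 1)).^2 / (2 * sigma^2));   % training Gram matrix
alpha = (K + lambda * eye(n)) \ y;      % dual coefficients
m = 200;
xt = linspace(-3, 3, m)';               % test inputs
Kt = exp(-(repmat(xt, 1, n) - repmat(x', m, 1)).^2 / (2 * sigma^2)); % test-vs-train kernel
f = Kt * alpha;                         % predictions at the test inputs
plot(x, y, '.', xt, f, '-');            % data and fitted function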

3. Constructing Kernels:
- Combining kernels (see the sketch after this list)
- ANOVA kernels
- String/tree kernels
- Graph kernels
- Diffusion kernels
- Kernels for text
- Marginal kernels & Fisher kernels
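
A minimal sketch of why combining kernels works, assuming toy data and unit parameters:
sums, positive scalings, and elementwise (Schur) products of valid kernels are again valid
kernels, which is the basic toolbox behind the constructions listed above:

% Closure properties of kernels, checked numerically on standard examples
% (linear and RBF kernels); the data and weights below are illustrative.
X = randn(15, 3);                  % 15 toy points in 3 dimensions
n = size(X, 1);
Klin = X * X';                     % linear kernel
sq = sum(X.^2, 2);
Krbf = exp(-(repmat(sq, 1, n) + repmat(sq', n, 1) - 2 * (X * X')) / 2);  % RBF, sigma = 1
Ksum  = Klin + 2 * Krbf;           % conic combination: still a valid kernel
Kprod = Klin .* Krbf;              % Schur (elementwise) product: still a valid kernel
min(eig((Ksum + Ksum') / 2))       % both remain positive semi-definite,
min(eig((Kprod + Kprod') / 2))     % up to rounding error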


Grading Criteria


Grading will be based on a combination of:
1. Weekly homework (reading material/exercises), which may be tested with an occasional quiz (20%).
2. A small project in which the student is asked to implement and test a kernel machine (30%).
3. A final exam (50%).


Textbook

Main book used in the class:
1. John Shawe-Taylor & Nello Cristianini,
    Kernel Methods for Pattern Analysis.
    Cambridge University Press, 2004.

Another very good reference is:
2. Bernhard Schölkopf and Alex Smola,
    Learning with Kernels.
    MIT Press, 2002.