CS 277: Data Mining
Spring 2013
Instructor: David Newman
Lectures: 12:30 to 1:50pm, Tuesdays and Thursdays, DBH 1500
Office Hours: Mondays, 11:00am - noon, DBH 4064
Reader: Sridevi Maharaj, sridevi.m@uci.edu
Message Board: https://eee.uci.edu/toolbox/messageboard/m13971/f41173/
Software: information about MATLAB
Background Reading: additional papers to supplement the text
Project Guidelines, including due dates, instructions, and links to data sets
Schedule
New
Homework 2 (due in class May 21) *** Please ignore the 2010 due date in the PDF!
Homework 2 info
Homework 2 first article to read
Homework 2 second article to read
Homework 2 third article to read
(Follows the CS 277 syllabus from Prof. Smyth. The syllabus below is subject to change.)
Introduction to Data Mining:
- Topics
- basic concepts in data mining
- data measurement
- exploratory data analysis
- data visualization
- Reading
- Links
- Slides
- Introduction to Data Mining [PPT] [PDF]
- Measurement and Data [PPT] [PDF]
- Exploratory Data Analysis and Visualization [PPT] [PDF]
- Homeworks
Basic Principles of Data Mining
- Topics
- predictive modeling: classification and regression
- model fitting as optimization
- evaluation of predictive performance
- overfitting, regularization
- other data mining tasks: clustering and pattern detection
- Reading
- Links
- Slides
Text Mining
- Topics
- information retrieval and search
- text classification
- unsupervised learning
- Reading
- Links
- Slides
- text classification [PPT] [PDF]
- text mining and topic models [PPT] [PDF]
- notes on graphical models [PPT] [PDF]
Recommender Systems
- Topics
- recommender data, Netflix prize data
- nearest neighbor algorithms
- matrix decomposition algorithms
- efficient algorithms for large data sets
- modeling systematic effects
- Reading
- Links
- Slides
- Recommender systems [PPT] [PDF]
- Netflix case study [PPT] [PDF]
Web Data Analysis
- Topics
- Web data: collection and interpretation
- analyzing user browsing behavior
- learning from clickthrough data
- predictive modeling and online advertising
- link analysis and the PageRank algorithm
- Reading
- Links
- Slides
Social Network Analysis
- Topics
- descriptive analysis of social networks
- network embedding and latent space models
- network data over time: dynamics and event-based networks
- link prediction
- Reading
- overview of network analysis methods (e.g., Newman et al.)
- overview of SNA concepts
- review paper on community detection in networks
- specific papers by Watts, Leskovec, Barabasi, etc.
- Links
- Slides
Schedule
- Week 1:
- Tue Apr 2: Read this Pedro Domingos paper
- Thu Apr 4:
- Week 2:
- Tue Apr 9:
- Thu Apr 11: Project proposals due in class
- Week 3:
- Tue Apr 16:
- Thu Apr 18: HW1 due in class
- Week 4:
- Week 5:
- Tue Apr 30: Project progress presentation in class
- Thu May 2: Project progress presentation in class
- Week 6:
- Week 7:
- Mon May 13: Office hours cancelled
- Tue May 14: ***NO LECTURE*** Continue work on project
- Thu May 16:
- Week 8:
- Tue May 21: Scientific writing. HW2 DUE IN CLASS
- Thu May 23: Scientific writing
- Week 9:
- Tue May 28: Project presentations in class
- Thu May 30: Project presentations in class
- Week 10:
- Tue June 4: Project presentations in class
- Thu June 6: Project presentations in class; Final project report due in class
Grading
Your class grade will be based on a class project (70%) and two
homework assignments (30% total, 15% each). The project requires
a progress report during the quarter, a presentation in class
during Week 9 or 10, and a final report.
Academic Honesty
It is the responsibility of each student to be familiar with the UCI Senate Academic Honesty Policies. For homework assignments and projects, you are allowed to discuss ideas and concepts verbally with other class members, but you are not allowed to look at or copy anyone else's written solutions or code relating to homework assignments or projects. All material submitted must be material you have personally written during this quarter. Failure to adhere to this policy can result in a failing grade in the class.