CS 274A: Background Notes and Reading, Winter 2022
Note that the contents of this page may be updated periodically during the quarter.
Notes
General Background/Review Material on Probability:
 Topics covered: random variables, conditional and joint
probabilities, Bayes rule, law of total probability, chain rule
and factorization. Frequentist and Bayesian views of probability. Sets of random variables, the
multivariate Gaussian model. Conditional independence and graphical
models. Markov models.
 Required Reading: Class Notes 1 and 2 on Probability Concepts and Multivariate Probability Models
 Recommended Additional Reading
 Optional Reading
 Murphy text:
Chapter 1 (introduction), Chapter 2.1 through 2.5 (probability and distributions), Chapter
10.1 to 10.3 (graphical models)
 Excellent 15-minute video on multivariate Gaussian distributions from our own Alex Ihler
 Chapter from Chris Bishop's book on graphical models (the material on graphical models starts about 20 pages into the document)
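As a quick sanity check on the multivariate Gaussian material above, here is a minimal sketch (not from the class notes; the parameter values are made up for illustration) that evaluates the density N(mu, Sigma) directly from its definition and confirms, by Monte Carlo, that samples match the parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

def gaussian_pdf(x, mu, Sigma):
    """Density p(x) = |2*pi*Sigma|^(-1/2) exp(-0.5 (x-mu)^T Sigma^{-1} (x-mu))."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)   # (x-mu)^T Sigma^{-1} (x-mu)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

# Samples drawn from N(mu, Sigma) should have empirical mean and
# covariance close to the true parameters.
X = rng.multivariate_normal(mu, Sigma, size=100_000)
print(X.mean(axis=0))   # approximately mu
print(np.cov(X.T))      # approximately Sigma
```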
Learning from Data using Maximum Likelihood
 Topics: Concepts of models and parameters. Definition of the likelihood function and the principle of maximum likelihood parameter estimation. Using maximum likelihood to learn the parameters of Gaussian, binomial, multivariate, and other parametric models.
 Required Reading: Class Notes 3 on Maximum Likelihood
 Recommended Additional Reading
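For the univariate Gaussian case named above, the maximum likelihood estimates have a closed form: the sample mean and the 1/N (biased) sample variance. A hedged sketch, with illustrative names and synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=3.0, scale=2.0, size=50_000)  # synthetic data, true mu=3, sigma^2=4

mu_hat = x.mean()                     # argmax of the log-likelihood in mu
var_hat = ((x - mu_hat) ** 2).mean()  # MLE uses 1/N, not the unbiased 1/(N-1)

print(mu_hat, var_hat)  # close to 3 and 4
```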
Bayesian Learning
 Topics: General principles of Bayesian estimation: prior densities, posterior densities, MAP,
fully Bayesian approaches. Beta/binomial and Gaussian examples. Predictive densities, model selection, model averaging.
 Required Reading: Review Note Sets 1 and 2 again
 Recommended Reading:
 Optional Additional Reading:
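The Beta/binomial example mentioned above is the standard case where the posterior is available in closed form: with a Beta(a, b) prior on the success probability and k successes in n trials, the posterior is Beta(a + k, b + n - k). A minimal sketch (the prior and data values are illustrative):

```python
a, b = 2.0, 2.0   # Beta prior pseudo-counts (illustrative choice)
n, k = 20, 14     # observed data: 14 successes in 20 trials

# Conjugate update: posterior is Beta(a + k, b + n - k).
a_post, b_post = a + k, b + n - k

posterior_mean = a_post / (a_post + b_post)          # fully Bayesian point summary
map_estimate = (a_post - 1) / (a_post + b_post - 2)  # posterior mode; valid for a_post, b_post > 1
mle = k / n                                          # maximum likelihood, for comparison

print(posterior_mean, map_estimate, mle)
```

Note how the posterior mean is pulled toward the prior mean (0.5) relative to the MLE, with the pull shrinking as n grows.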
Optimization Methods for Machine Learning
 Topics: General principles of finding minima/maxima of multivariate functions, gradient and Hessian methods,
stochastic gradient methods.
 Recommended Reading:
 Optional Additional Reading:
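A minimal gradient-descent sketch for the topics above (illustrative, not from the notes): minimize the quadratic f(w) = 0.5 w^T A w - b^T w, whose gradient is A w - b and whose unique minimum for positive-definite A is w* = A^{-1} b:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])   # positive definite
b = np.array([1.0, 1.0])

w = np.zeros(2)
step = 0.1                   # fixed step size; must be < 2/lambda_max(A) to converge
for _ in range(500):
    grad = A @ w - b         # gradient of the quadratic objective
    w = w - step * grad

print(w)                     # close to np.linalg.solve(A, b)
```

Stochastic gradient methods follow the same template but replace `grad` with a noisy estimate computed from a random subset of the data.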
Regression Models
 Topics: Linear models. Normal equations. Systematic and stochastic components.
Parameter estimation methods for regression. Maximum likelihood and Bayesian
interpretations.
 Required Reading: Class Notes on Regression
 Recommended Reading:
 Optional Additional Reading:
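The normal equations listed above can be sketched in a few lines: under Gaussian noise the maximum likelihood weights solve X^T X w = X^T y. Synthetic data; names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])  # design matrix with intercept
w_true = np.array([0.5, -2.0])
y = X @ w_true + rng.normal(scale=0.1, size=n)            # systematic part plus stochastic noise

# Normal equations; for ill-conditioned X, np.linalg.lstsq is safer.
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(w_hat)   # close to w_true
```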
Probabilistic Classification
 Topics: Bayes rule, classification boundaries, discriminant functions, optimal decisions, Bayes error rate, Gaussian classifiers. Likelihood-based approaches and properties of objective functions. Logistic regression and neural network models.
 Required Reading: Notes on Discriminant Functions and Optimal Classification (PDF). See also Class Notes on Classification, distributed via Piazza.
 Recommended Reading:
 Barber text: pages 229–234 (in Chapter 10 on Naive Bayes), pages 353–358 on logistic regression (in Chapter 17 on Linear Models)
 Notes on logistic regression
from Charles Elkan
 Optional Additional Reading:
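As a likelihood-based example for this unit, here is a hedged sketch (synthetic data, illustrative names) of fitting logistic regression by gradient ascent on the log-likelihood, whose gradient is X^T (y - p):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept plus one feature
w_true = np.array([-0.5, 2.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Generate labels from the logistic model itself.
y = (rng.uniform(size=n) < sigmoid(X @ w_true)).astype(float)

w = np.zeros(2)
lr = 0.1
for _ in range(2000):
    p = sigmoid(X @ w)            # predicted P(y = 1 | x)
    w += lr * X.T @ (y - p) / n   # gradient ascent on the average log-likelihood

print(w)   # roughly recovers w_true, up to sampling error
```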
The EM Algorithm, Mixture Models, and Probabilistic Clustering
State-Space and Time-Series Models
 Topics: discrete and continuous latent-state space models. Hidden Markov models,
Kalman filters. Basic principles of smoothing and filtering. Parameter estimation methods using EM.
 Required Reading: Note Set 5 above on Hidden Markov models.

 Tutorial paper on latent-variable models for time-series data, Barber and Cemgil, IEEE Signal Processing Magazine, 2010.
 Barber text: pages 451–471 (in Chapter 23 on Dynamical Models)
 Murphy text: Chapter 17.1 to 17.5
 Sequential modeling using
recurrent neural networks from the Goodfellow et al. text
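The filtering idea above can be sketched for a discrete hidden Markov model via the forward algorithm: alpha_t(k) is proportional to p(x_t | z_t = k) times the one-step prediction from alpha_{t-1}. The two-state parameters below are made up for illustration:

```python
import numpy as np

pi = np.array([0.5, 0.5])      # initial state distribution
A = np.array([[0.9, 0.1],      # transitions: A[j, k] = P(z_t = k | z_{t-1} = j)
              [0.2, 0.8]])
B = np.array([[0.8, 0.2],      # emissions: B[k, v] = P(x_t = v | z_t = k)
              [0.3, 0.7]])

obs = [0, 0, 1, 1, 1]          # observed symbol sequence

alpha = pi * B[:, obs[0]]
alpha /= alpha.sum()           # normalize to get the filtering distribution
for x in obs[1:]:
    alpha = (alpha @ A) * B[:, x]   # predict (alpha @ A), then weight by the likelihood
    alpha /= alpha.sum()

print(alpha)   # P(z_T | x_{1:T}): filtered state probabilities at the final step
```

Kalman filtering follows the same predict/update pattern, with the discrete sums replaced by Gaussian integrals in closed form.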
Sampling Methods