Accepted Tutorials

 

Max Welling

welling@ics.uci.edu

Bren School of Information and Computer Science
Irvine, CA 92697-3425
phone: (949) 824 8169
fax: (949) 824 4056

 


 

Mining Large Time Evolving Data Using Matrix and Tensor Tools

C. Faloutsos (CMU),

T.G. Kolda (Sandia National Labs)

J. Sun (CMU)

http://www.cs.cmu.edu/~christos/TALKS/ICML-07-tutorial/

How can we find patterns in sensor streams (e.g., a sequence of temperatures, water-pollutant measurements, or machine room measurements)? How can we mine an Internet traffic graph over time? Further, how can we make the process incremental? We review the state of the art in four related fields: (a) numerical analysis and linear algebra, (b) multi-linear/tensor analysis, (c) graph mining, and (d) stream mining. We will present theoretical results and algorithms as well as case studies on several real applications. Our emphasis is on the intuition behind each method and on guidelines for the practitioner.


Relational Data Community Generation

B. Long (SUNY Binghamton)

Z. Zhang (SUNY Binghamton)

P.S. Yu (IBM Thomas J. Watson Research Center)

http://www.fortune.binghamton.edu/icml2007tutorial.html

Relational data community generation is concerned with learning community structures from relational data, which involve rich collections of objects linked together in complex relational networks. It is a recently emerged hot topic in machine learning and data mining research, and the solutions developed have substantial impact on many important applications, including Web community mining, social network mining, and law enforcement activities. This tutorial will provide an introduction to the theory, practice, and open problems in relational data community generation. Among the topics that will be covered are formulation and representation of relational data; the graph approximation model; the matrix approximation model; spectral approaches; semi-definite programming approaches; probabilistic models; a unified theory; and applications. Attendees will get a broad overview of both the theory and practice of relational data community generation, with numerous examples of how both impact real-world applications.

Practical Statistical Relational Learning

P. Domingos (University of Washington)

http://www.cs.washington.edu/homes/pedrod/psrl.html


Statistical relational learning (SRL) focuses on learning when samples are not i.i.d. (independent and identically distributed). Domains where data is non-i.i.d. are widespread; examples include Web search, information extraction, perception, medical diagnosis/epidemiology, molecular and systems biology, social science, security, ubiquitous computing, and others. In all of these domains, modeling dependencies between examples can greatly improve predictive performance and lead to better understanding of the relevant phenomena. However, doing this can be much more complex than treating examples independently. The goal of this tutorial is to provide researchers and practitioners with the tools needed to learn from interdependent examples with no more difficulty than they learn from isolated examples today. We begin with an introduction to the four foundational areas of SRL: logical inference, inductive logic programming, probabilistic inference, and statistical learning. We then show how to combine them in a principled and efficient way and survey major approaches, using Markov logic as the foundation. Finally, we show how to apply these techniques to a wide variety of problems, using the Alchemy open-source software as a concrete tool.


Group Theoretical Methods in Machine Learning

R. Kondor (Columbia University)

http://www1.cs.columbia.edu/~risi/ICMLtutorial/index.html


Abstract Algebra has a powerful set of tools to offer Machine Learning, but the interface of these two fields is still very new. It is already clear that ideas from Non-commutative Harmonic Analysis have natural applications in capturing invariances, describing distributions over permutations and trees, and addressing ranking problems. The tutorial will provide a gentle introduction to the mathematical background to this emerging field (mainly Group Theory and Representation Theory), and explore further connections between Algebra and Learning.

Tensor Methods for Machine Learning, Computer Vision and Computer Graphics

M.A.O. Vasilescu (MIT)

A. Shashua (Hebrew University of Jerusalem)

www.media.mit.edu/~maov/classes/icml2007/

Tensor factorizations of higher order tensors have been successfully
applied in numerous machine learning, vision, graphics and signal
processing tasks in recent years and are drawing a lot of attention.
Two types of higher order tensor decomposition generalize different
concepts of the matrix SVD: the rank-R decomposition (an open problem)
and the rank-(R1,R2,...,RM) decomposition. There are also various
tensor factorizations under convex constraints that are relevant to
classical inference and clustering tasks.

In the first part of the tutorial, we will define the linear tensor
rank, rank-R, and the multilinear tensor rank, rank-(R1,R2,...,RM). The
linear tensor rank, rank-R, generalizes the matrix concept of rank,
while the multilinear rank, rank-(R1,R2,...,RM), generalizes the
matrix concepts of orthonormal row/column subspaces. We will discuss
several multilinear representations, Multilinear PCA, Multilinear ICA,
etc. and introduce the multilinear projection operator, tensor
pseudo-inverse and the identity tensor which are important in
performing recognition in a tensor framework. Furthermore, we will
discuss why images have been traditionally vectorized in statistical
learning, and discuss the advantages and disadvantages of treating
images as vectors, matrices and higher order objects in the context of
a tensor framework. In the second part of this tutorial, we will discuss
factorizations relevant to statistical inference and clustering where
orthogonality is replaced by convex constraints. We will address
general low-rank tensors (arising from latent class models),
super-symmetric tensors (arising from clustering over hypergraphs) and
semi-symmetric tensors (arising from latent clustering models), and
introduce factorization algorithms.
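
The two notions of tensor rank named above can be written out as
follows. This is a sketch using standard textbook definitions, not
necessarily the tutorial's own notation. The symbols (an M-th order
tensor A with mode sizes I_1,...,I_M, factor vectors u_r^(m), and the
mode-m unfolding A_(m)) are assumptions introduced here for
illustration.

```latex
% Assumed notation: \mathcal{A} is an M-th order tensor,
% \mathbf{A}_{(m)} is its mode-m unfolding (matricization).
\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_M}

% Rank-R (linear tensor rank): the smallest number of rank-1
% (outer-product) terms whose sum equals \mathcal{A}.
\operatorname{rank}(\mathcal{A}) = \min \Big\{ R :
  \mathcal{A} = \sum_{r=1}^{R}
  \mathbf{u}_r^{(1)} \circ \mathbf{u}_r^{(2)} \circ \cdots \circ \mathbf{u}_r^{(M)} \Big\}

% Multilinear rank, rank-(R_1, R_2, ..., R_M): one matrix rank per mode.
R_m = \operatorname{rank}\big(\mathbf{A}_{(m)}\big), \qquad m = 1, \dots, M
```

For M = 2 both definitions reduce to the ordinary matrix rank, which
is the sense in which each generalizes a different matrix concept.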

The tutorial will cover the application of these techniques to
compression, face recognition, multi-object detection in supervised
and unsupervised settings, gait recognition, and computer graphics.

Bayesian Methods in Reinforcement Learning

P. Poupart (University of Waterloo)

M. Ghavamzadeh (University of Alberta)

Y. Engel (University of Alberta)

http://www.cs.uwaterloo.ca/~ppoupart/ICML-07-tutorial-Bayes-RL.html

Although Bayesian methods for reinforcement learning can be traced
back to the 1960s (Howard's work in Operations Research), Bayesian
methods have only been used sporadically in modern reinforcement
learning.  This is in part because non-Bayesian approaches tend to be
much simpler to work with.  However, recent advances have shown that
Bayesian approaches do not need to be as complex as initially thought
and offer several theoretical advantages.  For instance, by keeping
track of full distributions (instead of point estimates) over the
unknowns, Bayesian approaches permit a more comprehensive
quantification of the uncertainty regarding the transition
probabilities, the rewards, the value function parameters and the
policy parameters. Such distributional information can be used to
optimize (in a principled way) the classic exploration/exploitation
tradeoff, which can speed up the learning process.  Similarly, active
learning for reinforcement learning can be naturally optimized.  The
gradient of performance with respect to value function and/or policy
parameters can also be estimated more accurately while using
less data.  Bayesian approaches also facilitate the encoding of prior
knowledge and the explicit formulation of domain assumptions.
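
As a small illustration of keeping a full distribution rather than a
point estimate, the sketch below maintains a Dirichlet posterior over
the next-state probabilities of a single state-action pair. The
Dirichlet model and the function names (dirichlet_update,
dirichlet_mean_and_variance) are assumptions chosen for illustration;
they are not taken from the tutorial material.

```python
import numpy as np

def dirichlet_update(prior_counts, observed_next_states):
    """Return posterior Dirichlet parameters after observing transitions.

    prior_counts: pseudo-counts encoding prior knowledge (hypothetical input).
    observed_next_states: indices of the next states that were observed.
    """
    alpha = np.array(prior_counts, dtype=float)
    for s_next in observed_next_states:
        alpha[s_next] += 1.0  # conjugate update: one count per observation
    return alpha

def dirichlet_mean_and_variance(alpha):
    """Posterior mean and per-component variance of a Dirichlet(alpha)."""
    total = alpha.sum()
    mean = alpha / total
    variance = alpha * (total - alpha) / (total ** 2 * (total + 1.0))
    return mean, variance

# Uniform prior over three next states, then three observed transitions.
alpha = dirichlet_update([1, 1, 1], [0, 0, 1])
mean, variance = dirichlet_mean_and_variance(alpha)
```

The variance shrinks as more transitions are observed, which is the
kind of uncertainty information an exploration strategy could use.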

This tutorial will provide an introduction to Bayesian learning,
followed by a historical account of Bayesian reinforcement learning
and a description of existing Bayesian methods for reinforcement
learning (including Gaussian processes, Bayesian gradient descent as
well as model-based Bayesian approaches for plain reinforcement
learning, imitation learning and multi-agent coordination in
reinforcement learning).  The properties and benefits of Bayesian
techniques for reinforcement learning will be discussed, analyzed and
illustrated with case studies.


Online Learning of Real-World Problems

K. Crammer (University of Pennsylvania)
http://www.cis.upenn.edu/~crammer/icml-tutorial-index.html


Most of the research in machine learning has been directed to the
problem of binary classification in which the learned classifier outputs
one of two possible answers. This is a fundamental problem, but it
does not fit many important real-world applications well. In this tutorial we
will focus on more complex settings in which there are many possible
answers with complex preference relationships among them. Notable
examples include multi-class categorization, hierarchical
classification, and sequence prediction. We will use the algorithmic
framework of online learning for several reasons. First, in general
online algorithms are conceptually simple and easy to implement.
Furthermore, online algorithms process one example at a time. Thus, such
methods are appealing for large data sets. Second, online algorithms
have been used in practice for the applications that we will use as
examples. Third, the analysis of these algorithms is based on
mathematical tools which are simpler than those needed for analyzing
other types of algorithms. The tutorial has two goals. The first is to
provide the audience with systematic tools to design and analyze learning
algorithms for their specific complex problems: from binary
classification through multi-class categorization to sequence
prediction. The second is to introduce new online algorithms which provide
state-of-the-art performance in practice and are accompanied by
theoretical guarantees.
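
To make the "one example at a time" setting concrete, here is a
minimal sketch of a classic online multi-class perceptron. It is a
textbook algorithm used only for illustration and is not claimed to be
one of the algorithms presented in the tutorial. The function name and
its arguments are hypothetical.

```python
import numpy as np

def multiclass_perceptron(stream, n_classes, n_features):
    """Online multi-class perceptron with one weight vector per class.

    stream: iterable of (x, y) pairs, x a feature vector, y a class index.
    Returns the learned weight matrix and the number of online mistakes.
    """
    W = np.zeros((n_classes, n_features))
    mistakes = 0
    for x, y in stream:
        y_hat = int(np.argmax(W @ x))  # predict the highest-scoring class
        if y_hat != y:
            W[y] += x       # promote the correct class
            W[y_hat] -= x   # demote the wrongly predicted class
            mistakes += 1
    return W, mistakes

# Toy data: three classes, each identified by its own one-hot feature.
examples = [(np.eye(3)[k], k) for k in range(3)]
W, mistakes = multiclass_perceptron(examples * 2, n_classes=3, n_features=3)
```

Each example is seen once and then discarded, so memory use does not
grow with the size of the data set.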


Semi-Supervised Learning

X. Zhu (University of Wisconsin-Madison)
http://www.cs.wisc.edu/~jerryzhu/icml07tutorial.html

Why can we learn from unlabeled data for supervised learning tasks? Do
unlabeled data always help? What are the popular semi-supervised learning
methods, and how do they work? How do they relate to each other? What are
the research trends? In this tutorial we address these questions. We will
examine state-of-the-art methods, including generative models, multiview
learning (e.g., co-training), graph-based learning (e.g., manifold
regularization), transductive SVMs and so on. We also offer some advice
for practitioners. Finally we discuss the connection between
semi-supervised machine learning and natural learning. The emphasis of the
tutorial is on the intuition behind each method and the assumptions it
needs.