Abstracts
| Circadian Clock Genes Contribute to the Regulation of Hair Follicle Cycling | |
The hair follicle renews itself by repeatedly cycling among growth, regression,
and rest phases. One function of hair follicle cycling is to allow seasonal
changes in hair growth. Understanding the regulation of hair follicle cycling
is also of interest because abnormal regulation of hair cycle control genes is
responsible for several types of human hair growth disorders and skin cancers.
We report here that Clock and Bmal1 genes,
which control circadian rhythms, are also important for the regulation of hair
follicle cycling, a biological process of much longer duration than 24 hours.
Detailed analysis of skin from mice mutated for central clock genes indicates a
significant delay in the progression of the hair growth phase. We show that
clock genes affect the expression of key cell cycle control genes and that
keratinocytes in a critical compartment of the hair follicles in
Bmal1 mutant mice are halted in the G1 phase of the cell
cycle. These findings provide novel insight into circadian control mechanisms
in modulating the progression of cyclic biological processes on different time
scales.
| Particle Belief Propagation | |
The popularity of particle filtering for inference in Markov chain
models defined over random variables with very large or continuous
domains makes it natural to consider sample--based versions of belief
propagation (BP) for more general (tree--structured or loopy) graphs.
Already, several such algorithms have been proposed in the
literature. However, many questions remain open about the behavior
of particle--based BP algorithms, and little theory
has been developed to analyze their performance.
In this paper, we describe a generic particle belief propagation (PBP)
algorithm which is closely related to previously proposed methods.
We prove that this algorithm is consistent, approaching the true BP
messages as the number of samples grows large.
We then use concentration bounds to analyze the finite-sample
behavior and give a convergence rate for the algorithm on
tree--structured graphs. Our convergence rate is $O(1/\sqrt{n})$
where $n$ is the number of samples, independent of the domain size
of the variables.
| Adaptive Updates for MAP Configurations with Applications to Bioinformatics | |
Many applications involve repeatedly computing the optimal, maximum
a posteriori (MAP) configuration of a graphical model as the
model changes, often slowly or incrementally over time, e.g., due to
input from a user.
Small changes to the model often require
updating only a small fraction of the MAP configuration, suggesting
the possibility of performing updates faster than recomputing from
scratch.
In this paper we present an algorithm for efficiently performing such
updates under arbitrary changes to the model.
Our algorithm is within a logarithmic factor of the optimal and is
asymptotically never slower than re-computing from-scratch: if a
modification to the model requires $m$ updates to the MAP
configuration of $n$ random variables, then our algorithm requires
$O(m\log{(n/m)})$ time; re-computing from scratch requires $O(n)$
time.
We evaluate the practical effectiveness of our algorithm by
considering two problems in genomic signal processing, CpG region
segmentation and protein sidechain packing, where a MAP configuration
must be repeatedly updated. Our results show significant speedups
over recomputing from scratch.
| Fast Collapsed Gibbs Sampling for Latent Dirichlet Allocation | |
In this paper we introduce a novel collapsed Gibbs sampling method for the
widely used latent Dirichlet allocation (LDA) model. Our
new method results in significant speedups on real world text
corpora. Conventional Gibbs sampling schemes for LDA require
O(K) operations per sample where K is the number of
topics in the model. Our proposed method draws equivalent
samples but requires on average significantly less then K operations per sample.
On real-word corpora FastLDA can be as much as 8
times faster than the standard collapsed Gibbs sampler for LDA. No approximations are
necessary, and we show that our fast sampling
scheme produces exactly the same results as the standard
(but slower) sampling scheme. Experiments on four real world
data sets demonstrate speedups for a wide range of
collection sizes. For the PubMed collection of over 8 million
documents with a required computation time of 6 CPU months for LDA,
our speedup of 5.7 can save 5 CPU months of computation.
| Adaptive Inference in General Graphical Models | |
Many algorithms and applications involve repeatedly solving
variations of the same inference problem; for example we may want to
introduce new evidence to the model or perform updates to
conditional dependencies. The goal of \emph{adaptive inference} is
to take advantage of what is preserved in the model and perform
inference more rapidly than from scratch. In this paper, we
describe techniques for adaptive inference on general graphs that
support marginal computation and updates to the conditional
probabilities and dependencies in logarithmic time. We give
experimental results for an implementation of our algorithm, and
demonstrate its potential performance benefit in the study of
protein structure.
| Probabilistic Analysis of a Large Scale Urban Traffic Sensor Data Set | |
Real-world sensor time series are often significantly noisier
and more difficult to work with than the relatively clean
data sets that tend to be used as the basis for experiments
in many research papers. In this paper we report on a large
case-study involving statistical data mining of over 300 million
measurements from 1700 freeway traffic sensors over a
period of seven months in Southern California. We discuss
the challenges posed by the wide variety of different sensor
failures and anomalies present in the data. The volume and
complexity of the data precludes the use of manual
visualization or simple thresholding techniques to identify these
anomalies. We describe the application of probabilistic
modeling and unsupervised learning techniques to this data set
and illustrate how these approaches can successfully detect
underlying systematic patterns even in the presence of
substantial noise and missing data
| Graphical Models for Statistical Inference and Data Assimilation | |
In data assimilation for a system which evolves in time, one combines past and current observations with a model of the dynamics of the system, in order to improve the simulation of the system as well as any future predictions about it. From a statistical point of view, this process can be regarded as estimating many random variables which are related both spatially and temporally: given observations of some of these variables, typically corresponding to times past, we require estimates of several others, typically corresponding to future times.
Graphical models have emerged as an effective formalism for assisting in these types of inference tasks, particularly for large numbers of
random variables. Graphical models provide a means of representing dependency structure among the variables, and can provide both intuition
and efficiency in estimation and other inference computations. We provide an overview and introduction to graphical models, and describe how
they can be used to represent statistical dependency and how the resulting structure can be used to organize computation. The relation between
statistical inference using graphical models and optimal sequential estimation algorithms such as Kalman filtering is discussed. We then give
several additional examples of how graphical models can be applied to climate dynamics, specifically estimation using multi-resolution models of
large-scale data sets such as satellite imagery, and learning hidden Markov models to capture rainfall patterns in space and time.
| Learning to detect events with Markov-modulated Poisson processes | |
Time-series of count data occur in many different contexts, including Internet navigation logs, freeway traffic monitoring, and security logs associated with buildings. In this article we describe a framework for detecting anomalous events in such data using an unsupervised learning approach. Normal periodic behavior is modeled via a time-varying Poisson process model, which in turn is modulated by a hidden Markov process that accounts for bursty events. We outline a Bayesian framework for learning the parameters of this model from count time-series. Two large real-world datasets of time-series counts are used as testbeds to validate the approach, consisting of freeway traffic data and logs of people entering and exiting a building. We show that the proposed model is significantly more accurate at detecting known events than a more traditional threshold-based technique. We also describe how the model can be used to investigate different degrees of periodicity in the data, including systematic day-of-week and time-of-day effects, and to make inferences about different aspects of events such as number of vehicles or people involved. The results indicate that the Markov-modulated Poisson framework provides a robust and accurate framework for adaptively and autonomously learning how to separate unusual bursty events from traces of normal human activity.
[ BibTex ]
| Adaptive Bayesian Inference | |
Motivated by stochastic systems in which observed evidence and
conditional dependencies between states of the network change over
time, and certain quantities of interest (marginal distributions,
likelihood estimates etc.) must be updated, we study the problem of
\emph{adaptive inference} in tree-structured Bayesian networks. We
describe an algorithm for adaptive inference that handles a broad
range of changes to the network and is able to maintain marginal
distributions, MAP estimates, and data likelihoods in all expected
logarithmic time. We give an implementation of our algorithm and
provide experiments that show that the algorithm can yield up to two
orders of magnitude speedups on answering queries and responding to
dynamic changes over the sum-product algorithm.
| Accuracy Bounds for Belief Propagation | |
The belief propagation algorithm is widely applied to perform
approximate inference on arbitrary graphical models, in part due to
its excellent empirical properties and performance. However, little
is known theoretically about when this algorithm will perform well.
Using recent analysis of convergence and stability properties in
belief propagation and new results on approximations in binary
systems, we derive a bound on the error in BP's estimates for
pairwise Markov random fields over discrete--valued random variables.
Our bound is relatively simple to compute, and compares favorably
with a previous method for bounding the accuracy of
belief propagation.
| Modeling Count Data from Multiple Sensors: A Building Occupancy Model | |
Knowledge of the number of people in a building at a given time is
crucial for applications such as emergency response. Sensors can be
used to gather noisy measurements which when combined, can be used
to make inferences about the location, movement and density of
people. In this paper we describe a probabilistic model for
predicting the occupancy of a building using networks of
people-counting sensors. This model provides robust predictions
given typical sensor noise as well as missing and corrupted data
from malfunctioning sensors. We experimentally validate the model
by comparing it to a baseline method using real data from a network
of optical counting sensors in a campus building.
| Distributed Fusion in Sensor Networks | |
Distributed inference methods developed for graphical models comprise a principled
approach for data fusion in sensor networks. The application of these methods, however,
requires some care due to a number of issues that are particular to sensor networks.
Chief of among these are the distributed nature of computation and
deployment coupled with communications bandwidth and energy constraints typical
of many sensor networks. Additionally, information sharing in a sensor network necessarily
involves approximation. Traditional measures of distortion are not sufficient to characterize the
quality of approximation as they do not address in an explicit manner the resulting impact on inference
which is at the core of many data fusion problems. While both graphical models and a distributed
sensor network have network structures associated with them, the mapping is not one to one.
All of these issues complicate the mapping of a particular inference problem to a given sensor network
structure. Indeed, there may be a variety of mappings with very different characteristics with
regard to computational complexity and utilization of resources. Nevertheless, it is the case that
many of the powerful distributed inference methods have a role in information fusion for sensor
networks. In this article we present an overview of research conducted by the authors that has
sought to clarify many of the important issues at the intersection
of these domains. We discuss both theoretical issues and prototypical
applications in addition to suggesting new lines of reasoning.
| Gibbs Sampling for (Coupled) Infinite Mixture Models in the Stick Breaking Representation | |
Nonparametric Bayesian approaches to clustering,
information retrieval, language modeling
and object recognition have recently shown great
promise as a new paradigm for unsupervised data
analysis. Most contributions have focused on the
Dirichlet process mixture models or extensions
thereof for which efficient Gibbs samplers exist.
In this paper we explore Gibbs samplers for
infinite complexity mixture models in the stick
breaking representation. The advantage of this
representation is improved modeling flexibility.
For instance, one can design the prior distribution
over cluster sizes or couple multiple infi-
nite mixture models (e.g., over time) at the level
of their parameters (i.e., the dependent Dirichlet
process model). However, Gibbs samplers for in-
finite mixture models (as recently introduced in
the statistics literature) seem to mix poorly over
cluster labels. Among others issues, this can have
the adverse effect that labels for the same cluster
in coupled mixture models are mixed up. We
introduce additional moves in these samplers to
improve mixing over cluster labels and to bring
clusters into correspondence. An application to
modeling of storm trajectories is used to illustrate
these ideas.
| Adaptive Event Detection with Time-Varying Poisson Processes | |
Time-series of count data are generated in many different contexts, such as web access logging, freeway traffic monitoring, and security logs associated with buildings. Since this data measures the aggregated behavior of individual human beings, it typically exhibits a periodicity in time on a number of scales (daily, weekly,etc.) that reflects the rhythms of the underlying human activity and makes the data appear non-homogeneous. At the same time, the data is often corrupted by a number of bursty periods of unusual behavior such as building events, traffic accidents, and so forth. The data mining problem of finding and extracting these anomalous events is made difficult by both of these elements. In this paper we describe a framework for unsupervised learning in this context, based on a time-varying Poisson process model that can also account for anomalous events. We show how the parameters of this model can be learned from count time series using statistical estimation techniques. We demonstrate the utility of this model on two datasets for which we have partial ground truth in the form of known events, one from freeway traffic data and another from building access data, and show that the model performs significantly better than a non-probabilistic, threshold-based technique. We also describe how the model can be used to investigate different degrees of periodicity in the data, including systematic day-of-week and time-of-day effects, and make inferences about the detected events (e.g., popularity or level of attendance). Our experimental results indicate that the proposed time-varying Poisson model provides a robust and accurate framework for adaptively and autonomously learning how to separate unusual bursty events from traces of normal human activity.
| Learning Time-Intensity Profiles of Human Activity Using Nonparametric Bayesian Models | |
Data sets that characterize human activity over time through collections of timestamped
events or counts are of increasing interest in application areas as humancomputer
interaction, video surveillance, and Web data analysis. We propose a
non-parametric Bayesian framework for modeling collections of such data. In
particular, we use a Dirichlet process framework for learning a set of intensity
functions corresponding to different categories, which form a basis set for representing
individual time-periods (e.g., several days) depending on which categories
the time-periods are assigned to. This allows the model to learn in a data-driven
fashion what factors are generating the observations on a particular day, including
(for example) weekday versus weekend effects or day-specific effects corresponding
to unique (single-day) occurrences of unusual behavior, sharing information
where appropriate to obtain improved estimates of the behavior associated
with each category. Applications to realworld data sets of count data involving
both vehicles and people are used to illustrate the technique.
| Loopy Belief Propagation: Convergence and Effects of Message Errors | |
Belief propagation (BP) is an increasingly popular method of performing
approximate inference on arbitrary graphical models. At times, even
further approximations are required, whether due to quantization of the
messages or model parameters, from other simplified message or model
representations, or from stochastic approximation methods. The
introduction of such errors into the BP message computations has the
potential to affect the solution obtained adversely. We analyze the
effect resulting from message approximation under two particular measures
of error, and show bounds on the accumulation of errors in the system.
This analysis leads to convergence conditions for traditional BP message
passing, and both strict bounds and estimates of the resulting error in
systems of approximate BP message passing.
| Nonparametric Belief Propagation for Sensor Network Self-Calibration | |
Automatic self-localization is a critical need for the effective use of
ad-hoc sensor networks in military or civilian applications. In general,
self-localization involves the combination of absolute location
information (\eg GPS) with relative calibration information (\eg distance
measurements between sensors) over regions of the network. Furthermore, it
is generally desirable to distribute the computational burden across the
network and minimize the amount of inter-sensor communication. We
demonstrate that the information used for sensor localization is
fundamentally local with regard to the network topology and use this
observation to reformulate the problem within a graphical model framework.
We then present and demonstrate the utility of \emph{nonparametric belief
propagation} (NBP), a recent generalization of particle filtering, for
both estimating sensor locations and representing location uncertainties.
NBP has the advantage that it is easily implemented in a distributed
fashion, admits a wide variety of statistical models, and can represent
multi-modal uncertainty. Using simulations of small- to moderately-sized
sensor networks, we show that NBP may be made robust to outlier
measurement errors by a simple model augmentation, and that judicious
message construction can result in better estimates. Furthermore, we
provide an analysis of NBP's communications requirements, showing that
typically only a few messages per sensor are required, and that even low
bit-rate approximations of these messages can have little or no
performance impact.
| Particle Filtering Under Communications Constraints | |
Particle filtering is often applied to the problem of object tracking
under non-Gaussian uncertainty; however, sensor networks frequently
require that the implementation be local to the region of interest,
eventually forcing the large, sample-based representation to be moved
among power-constrained sensors. We consider the problem of successive
approximation (i.e., lossy compression) of each sample-based density
estimate, in particular exploring the consequences (both theoretical and
empirical) of several possible choices of loss function and their
interpretation in terms of future errors in inference, justifying their
use for measuring approximations in distributed particle filtering.
| Estimating Dependency and Significance for High-Dimensional Data | |
Understanding the dependency structure of a set of variables is a
key component in various signal processing applications which involve
data association. The simple task of detecting whether any
dependency exists is particularly difficult when models of the data
are unknown or difficult to characterize because of high-dimensional
measurements. We review the use of nonparametric tests for characterizing
dependency and how to carry out these tests with highdimensional
observations. In addition we present a method to assess
the significance of the tests.
| Nonparametric Hypothesis Tests for Statistical Dependency | |
Determining the structure of dependencies among a set of variables is a
common task in many signal and image processing applications, including
multi-target tracking and computer vision. In this paper we present an
information-theoretic, machine learning approach to problems of this type.
We cast this problem as a hypothesis test between factorizations of
variables into mutually independent subsets. We show that the likelihood
ratio can be written as sums of two sets of Kullback-Leibler (KL)
divergence terms. The first set captures the structure of the statistical
dependencies within each hypothesis, while the second set measures the
details of model differences between hypotheses. We then consider the
case when the signal prior models are unknown, so that the distributions
of interest must be estimated directly from data, showing that the second
set of terms is (asymptotically) negligible and quantifying the loss in
hypothesis separability when the models are completely unknown. We
demonstrate the utility of nonparametric estimation methods for such
problems, providing a general framework for determining and distinguishing
between dependency structures in highly uncertain environments.
Additionally, we develop a machine learning approach for estimating lower
bounds on KL-divergence and mutual information from samples of
high-dimensional random variables for which direct density estimation is
infeasible. We present empirical results in the context of three
prototypical applications: association of signals generated by sources
possessing harmonic behavior, scene correspondence using video imagery,
and detection of coherent behavior among sets of moving objects.
| Message Errors in Belief Propagation | |
Belief propagation (BP) is an increasingly popular method of performing
approximate inference on arbitrary graphical models. At times, even
further approximations are required, whether from quantization or other
simplified message representations or from stochastic approximation
methods. Introducing such errors into the BP message computations has
the potential to adversely affect the solution obtained. We analyze this
effect with respect to a particular measure of message error, and show
bounds on the accumulation of errors in the system. This leads both to
convergence conditions and error bounds in traditional and approximate BP
message passing.
| Nonparametric Belief Propagation for Sensor Network Self-Calibration | |
Automatic self-calibration of ad-hoc sensor networks is a critical need
for their use in military or civilian applications. In general,
self-calibration involves the combination of absolute location information
(e.g. GPS) with relative calibration information (e.g. estimated distance
between sensors) over regions of the network. We formulate the
self-calibration problem as a graphical model, enabling application of
nonparametric belief propagation (NBP), a recent generalization of
particle filtering, for both estimating sensor locations and representing
location uncertainties. NBP has the advantage that it is easily
implemented in a distributed fashion, can represent multi-modal
uncertainty, and admits a wide variety of statistical models. This last
point is particularly appealing in that it can be used to provide
robustness against occasional high-variance (outlier) noise. We
illustrate the performance of NBP using Monte Carlo analysis on an example
network.
| Nonparametric Belief Propagation for Self-Calibration in Sensor Networks | |
Automatic self-calibration of ad-hoc sensor networks is a critical need
for their use in military or civilian applications. In general,
self-calibration involves the combination of absolute location information
(e.g. GPS) with relative calibration information (e.g. time delay or
received signal strength between sensors) over regions of the network.
Furthermore, it is generally desirable to distribute the computational
burden across the network and minimize the amount of inter-sensor
communication. We demonstrate that the information used for sensor
calibration is fundamentally local with regard to the network topology and
use this observation to reformulate the problem within a graphical model
framework. We then demonstrate the utility of \emph{nonparametric belief
propagation} (NBP), a recent generalization of particle filtering, for
both estimating sensor locations and representing location uncertainties.
NBP has the advantage that it is easily implemented in a distributed
fashion, admits a wide variety of statistical models, and can represent
multi-modal uncertainty. We illustrate the performance of NBP on several
example networks while comparing to a previously published nonlinear least
squares method.
| Communications-Constrained Inference | |
In many applications, particularly power-constrained sensor networks, it is important to conserve the amount of data exchanged while maximizing the utility of that data for some inference task. Broadly, this tradeoff has two major cost components—the representation’s size (in distributed networks, the communications cost) and the error incurred by its use (the inference cost).
We analyze this tradeoff for a particular problem: communicating a particle-based representation (and more generally, a Gaussian mixture or kernel density estimate). We begin by characterizing the exact communication cost of these representations, noting that it is less than might be suggested by traditional communications theory due to the invariance of the represen- tation to reordering. We describe the optimal, lossless encoder when the generating distribution is known, and pose a sub-optimal encoder which still benefits from reordering invariance.
However, lossless encoding may not be sufficient. We describe one reasonable measure of
error for distribution-based messages and its consequences for inference in an acyclic network,
and propose a novel density approximation method based on KD-tree multiscale representations
which enables the communications cost and a bound on error to be balanced efficiently. We
show several empirical examples demonstrating the method’s utility in collaborative, distributed
signal processing under bandwidth or power constraints.
| Nonparametric Belief Propagation | |
In many applications of graphical models arising in computer vision, the hidden variables of interest
are most naturally specified by continuous, non-Gaussian distributions. There exist inference algorithms
for discrete approximations to these continuous distributions, but for the high-dimensional variables
typically of interest, discrete inference becomes infeasible. Stochastic methods such as particle
filters provide an appealing alternative. However, existing techniques fail to exploit the rich
structure of the graphical models describing many vision problems. Drawing on ideas from regularized
particle filters and belief propagation (BP), this paper develops a nonparametric belief propagation
(NBP) algorithm applicable to general graphs. Each NBP iteration uses an efficient sampling procedure
to update kernel-based approximations to the true, continuous likelihoods. The algorithm can accomodate
an extremely broad class of potential functions, including nonparametric representations. Thus, NBP
extends particle filtering methods to the more general vision problems that graphical models can
describe. We apply the NBP algorithm to infer component interrelationships in a parts-based face model,
allowing location and reconstruction of occluded features.
| Efficient Multiscale Sampling from Products of Gaussian Mixtures | |
The problem of approximating the product of several Gaussian mixture
distributions arises in a number of contexts, including the nonparametric
belief propagation (NBP) inference algorithm and the training of
product of experts models.
This paper develops two multiscale algorithms for sampling from a
product of Gaussian mixtures, and compares their performance to existing
methods. The first is a multiscale variant of previously proposed
Monte Carlo techniques, with comparable
theoretical guarantees but improved empirical convergence rates.
The second makes use of approximate kernel density evaluation
methods to construct a fast approximate sampler, which is
guaranteed to sample points to within a tunable parameter $\epsilon$
of their true probability.
We compare both multiscale samplers on a set of computational examples
motivated by NBP, demonstrating significant improvements over existing
methods.
| Hypothesis Testing over Factorizations for Data Association | |
The issue of data association arises frequently in sensor
networks; whenever multiple sensors and sources are present, it
may be necessary to determine which observations from different
sensors correspond to the same target. In highly uncertain
environments, one may need to determine this correspondence
without the benefit of an \emph{a priori} known joint
signal/sensor model. This paper examines the data association
problem as the more general hypothesis test between factorizations
of a single, learned distribution. The optimal test between known
distributions may be decomposed into model-dependent and
statistical dependence terms, quantifying the cost incurred by
model estimation from measurements compared to a test between
known models. We demonstrate how one might evaluate a two-signal
association test efficiently using kernel density estimation
methods to model a wide class of possible distributions, and show
the resulting algorithm's ability to determine correspondence in
uncertain conditions through a series of synthetic examples. We
then describe an extension of this technique to multi-signal
association which can be used to determine correspondence while
avoiding the computationally prohibitive task of evaluating all
hypotheses. Empirical results of the approximate approach are
presented.
| Nonparametric Belief Propagation | |
In applications of graphical models arising in fields such as computer vision,
the hidden variables of interest are most naturally specified by continuous,
non-Gaussian distributions. However, due to the limitations of existing inference
algorithms, it is often necessary to form coarse, discrete approximations to
such models. In this paper, we develop a nonparametric belief propagation
(NBP) algorithm, which uses stochastic methods to propagate kernel-based approximations
to the true continuous messages. Each NBP message update is based on an efficient
sampling procedure which can accomodate an extremely broad class of potential
functions, allowing easy adaptation to new application areas. We validate our
method using comparisons to continuous BP for Gaussian networks, and an application
to the stereo vision problem.
| Nonparametric Estimators for Online Signature Authentication | |
We present extensions to our previous work in modelling dynamical processes.
The approach uses an information theoretic criterion for searching over
subspaces of the past observations, combined with a nonparametric density
characterizing its relation to one-step-ahead prediction and uncertainty.
We use this methodology to model handwriting stroke data, specifically
signatures, as a dynamical system and show that it is possible to learn
a model capturing their dynamics for use either in synthesizing realistic
signatures and in discriminating between signatures and forgeries even
though no forgeries have been used in constructing the model.
This novel approach yields promising results even for small training sets.
| Learning Informative Statistics: A Nonparametric Approach | |
We discuss an information theoretic approach for categorizing and
modeling dynamic processes. The approach can learn a compact and
informative statistic which summarizes past states to predict future
observations. Furthermore, the uncertainty of the prediction is
characterized nonparametrically by a joint density over the learned
statistic and present observation. We discuss the application of the
technique to both noise driven dynamical systems and random
processes sampled from a density which is conditioned on the
past. In the first case we show results in which both the dynamics
of random walk and the statistics of the driving noise are
captured. In the second case we present results in which a
summarizing statistic is learned on noisy random telegraph waves
with differing dependencies on past states. In both cases the
algorithm yields a principled approach for discriminating processes
with differing dynamics and/or dependencies. The method is grounded
in ideas from information theory and nonparametric statistics.
