ICS-295 Spring 2007, Reasoning and Modeling with Graphical Models


Class Reference

The seminar will focus on recent advances in reasoning and knowledge representation with graphical models, as well as on exploring some application areas. One such area is genetic linkage analysis, which involves both probabilistic information and constraints. In 2005 we explored the state of the art in applying Bayesian network algorithms to linkage analysis. Following a quick overview, we will read recent papers, focusing on issues introduced by the recent availability of Single Nucleotide Polymorphism (SNP) data and the presence of linkage disequilibrium (LD). Other application areas are welcome. We will consider areas such as modeling human behavior and the environment in transportation systems and the processing of picture databases.
Each student will be engaged in a research project and will be required to present relevant papers from the literature as well as their own findings. Students will also need to provide a final report for their research project.

The topics we will discuss include:

  • How to handle SNP data in Linkage analysis. How to handle LD (Linkage disequilibrium).
  • Sampling algorithms for linkage analysis (MCMC methods by Elizabeth Thompson; the Morgan system)
  • Cleaning data files: In many applications, parts of the data contain typos or have been corrupted in various ways. This problem also exists in linkage files. Can it be modeled as a graphical model?
  • Exploiting hypergraph structure. In particular, looking into the significance of hypertree width vs treewidth in capturing instance-based complexity of reasoning in graphical models.
  • Approximate and bounding algorithms for posterior marginals (e.g., via belief propagation, sampling, or both)
  • Using multiple heuristics during search in graphical models
  • Dynamic Temporal Bayesian networks


RELATED COURSE: Advanced Topics in Bioinformatics, by Dan Geiger.
Useful Linkage-related Links 

Relevant Papers

I. Linkage

  1. Eric S. Lander and Philip Green. Construction of multilocus genetic linkage maps in humans. Genetics. April 1987. PDF
  2. Goncalo R. Abecasis and Janis E. Wigginton. Handling Marker-Marker Linkage Disequilibrium: Pedigree Analysis with Clustered Markers, American Journal of Human Genetics. Sept 2005. PDF
  3. P Stuart et al. Analysis of RUNX1 binding site and RAPTOR polymorphisms in psoriasis: no evidence for association despite adequate power and evidence for linkage. American Journal of Human Genetics. Sept 2006. Download here
  4. Joshua T. Burdick, Wei-Min Chen, Gonçalo R. Abecasis and Vivian G. Cheung. In silico method for inferring genotypes in pedigrees. Nature Genetics. August 2006. Download here
  5. Gonçalo R. Abecasis, Stacey S. Cherny, William O. Cookson and Lon R. Cardon. Merlin--rapid analysis of dense genetic maps using sparse gene flow trees. Nature Genetics. Dec 2001. PDF
  6. Scheet, P. and Stephens, M. A Fast and Flexible Statistical Model for Large-Scale Population Genotype Data: Applications to Inferring Missing Genotypes and Haplotypic Phase. American Journal of Human Genetics. 2006. PDF
  7. Stephens, M. and Scheet, P. Accounting for Decay of Linkage Disequilibrium in Haplotype Inference and Missing-Data Imputation. American Journal of Human Genetics. 2005. PDF
  8. M. Fishelson and D. Geiger. Exact Genetic Linkage Computations for General Pedigrees.  Bioinformatics, Volume 18 Suppl. 1: S189-S198 (July 2002). Also presented in ISMB2002 (August, 2002). PDF
  9. M. Fishelson and D. Geiger. Optimizing Exact Genetic Linkage Computations. Journal of Computational Biology, Volume 11(2-3): 263-75 (2004). Also presented in RECOMB2003 (April, 2003). PS
  10. M. Fishelson, N. Dovgolevsky and D. Geiger. Maximum Likelihood Haplotyping for General Pedigrees. Technical Report CS-2004-13. To appear in Human Heredity, 2005. PDF
  11. M. Fishelson, N. Dovgolevsky and D. Geiger. Maximum Likelihood Haplotyping for General Pedigrees. Human Heredity 2004. PDF
  12. Yun Ju Sung, Elizabeth A. Thompson and Ellen M. Wijsman. MCMC-Based Linkage Analysis for Complex Traits on General Pedigrees: Multipoint Analysis With a Two-Locus Model and a Polygenic Component. Genetic Epidemiology 2007. PDF

II. Hypergraph analysis of graphical models (the hypertree width)

  1. Sathiamoorthy Subbarayan and Henrik Reif Andersen, Backtracking Procedures for Hypertree, HyperSpread and Connected Hypertree Decomposition of CSPs, IJCAI 2007. PDF
  2. Marko Samer and Stefan Szeider, Constraint Satisfaction with Bounded Treewidth Revisited, CP 2006. PDF
  3. Francesco Scarcello, Gianluigi Greco and Nicola Leone, Weighted Hypertree Decomposition and Optimal Query Plans. PODS 2004. PDF
  4. Yong Gao and Joseph Culberson. Consistency and Random Constraint Satisfaction Models. JAIR 2006. PDF

III.  Inference

  1. Amir Globerson and Tommi Jaakkola, Approximate inference using conditional entropy decompositions, AISTATS 2007. PDF
  2. Timothee Cour and Jianbo Shi, Solving Markov Random Fields with Spectral Relaxation, AISTATS 2007. PDF
  3. Joris Mooij, Bastian Wemmenhove, Bert Kappen and Tommaso Rizzo, Loop Corrected Belief Propagation, AISTATS 2007. PDF
  4. Edward Snelson and Zoubin Ghahramani, Local and global sparse Gaussian process approximations. AISTATS 2007. PDF
  5. Gregory Druck, Mukund Narasimhan and Paul Viola. Learning A* underestimates: Using inference to guide inference. AISTATS 2007. PDF
  6. M. J. Wainwright, T. Jaakkola and A. S. Willsky. A new class of upper bounds on the log partition function. IEEE Trans. on Information Theory, vol. 51, pages 2313--2335, July 2005. PDF
  7. M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference. UC Berkeley, Dept. of Statistics, Technical Report 649. September, 2003. PDF







(a) Visual-based experience (Ramesh Jain, ppt), (b) In-house competition overview (Vibhav Gogate), (c) Empirical results with optimization-based AOBDD (Radu Marinescu) and (d) Overview of Linkage Analysis (Rina Dechter)



(a) Continued Background on Linkage Analysis (Rina Dechter, ppt1, ppt2), (b) Overview of the SampleSearch scheme (Vibhav Gogate, ppt)



(a) Overview of Linkage Analysis (Rina Dechter), (b) Continuous Time Bayesian Networks (Guy Yosiphon) and (c) Finding hypertree width (Lars Otten)



(a) Finding hypertree width (Lars Otten, pdf), (b) Linkage Disequilibrium Mapping and HaploBlock (Rina Dechter, pdf) and (c) Mapping by Admixture Linkage Disequilibrium (Rina Dechter, ppt)



(a) Paper by Yun Ju Sung, Elizabeth A. Thompson and Ellen M. Wijsman (Vibhav Gogate, ppt), (b) Paper by Gonçalo Abecasis and Janis Wigginton (Radu Marinescu, ppt)



More on Continuous Time Bayesian Networks and possible connections to stochastic grammars (Guy Yosiphon)



(a) More on Linkage analysis (Vibhav Gogate, ppt) and (b) Radio advertisement (Google)