Fast Nonparametric Clustering of Structured Time-Series

In this paper, we combine two Bayesian nonparametric models: the Gaussian process (GP) and the Dirichlet process (DP). On the GP side, we introduce a variation on the GP prior that lets us model structured time-series data, i.e., data containing groups in which we wish to model both inter- and intra-group variability. On the DP side, we implement a new fast collapsed variational inference procedure that optimizes the variational approximation significantly faster than standard variational Bayes (VB) approaches. In a biological time-series application, we show that our model better captures salient features of the data, leading to better consistency with existing biological classifications, while the associated inference algorithm provides a significant speed-up over EM-based variational inference.
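
To make the idea of a GP prior over grouped time series concrete, the following is a minimal sketch of one common additive construction: a covariance shared by all series plus a covariance that is active only between observations from the same group. The abstract does not specify the construction, so the additive form, the squared-exponential kernel, and every name and hyperparameter value below (`rbf`, `hierarchical_cov`, `shared`, `within`, `noise`) are illustrative assumptions rather than the authors' model.

```python
import numpy as np


def rbf(t1, t2, variance, lengthscale):
    """Squared-exponential kernel between two 1-D arrays of time points."""
    d = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def hierarchical_cov(t, group, shared=(1.0, 2.0), within=(0.3, 1.0), noise=1e-2):
    """Covariance for grouped time series (illustrative assumption).

    shared : (variance, lengthscale) of the function common to all groups,
             which drives the correlation between groups.
    within : (variance, lengthscale) of the group-specific deviation,
             which only correlates observations from the same group.
    """
    t = np.asarray(t, dtype=float)
    group = np.asarray(group)
    k_shared = rbf(t, t, *shared)
    # Indicator matrix: 1 where two observations belong to the same group.
    same_group = (group[:, None] == group[None, :]).astype(float)
    k_within = same_group * rbf(t, t, *within)
    return k_shared + k_within + noise * np.eye(len(t))


# Two groups observed at the same six time points.
t = np.tile(np.linspace(0.0, 10.0, 6), 2)
group = np.repeat([0, 1], 6)
K = hierarchical_cov(t, group)

# Draw one joint sample: both groups follow the shared trend,
# and each adds its own smaller deviation.
rng = np.random.default_rng(0)
sample = rng.multivariate_normal(np.zeros(len(t)), K)
```

Under these assumptions, two observations taken at the same time in different groups covary only through the shared term, while observations within a group also share the group-specific term. That separation is what allows inter- and intra-group variability to be modelled with distinct hyperparameters.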
