Discovering shared and individual latent structure in multiple time series

This paper proposes a nonparametric Bayesian method for exploratory data analysis and feature construction in continuous time series. The method aims to uncover features shared across a set of time series that nonetheless exhibit significant individual variability. It builds on the framework of latent Dirichlet allocation (LDA) and its extension to hierarchical Dirichlet processes, which allows us to characterize each series as switching between latent ``topics'', where each topic is a distribution over ``words'' that specify the series dynamics. However, unlike standard applications of LDA, we discover the words as we learn the model. We apply this model to the task of tracking the physiological signals of premature infants; our model obtains clinically significant insights as well as useful features for supervised learning tasks.
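To make the topic/word picture concrete, the following is a minimal generative sketch, not the paper's model or its inference procedure. It assumes, purely for illustration, that each ``word'' is a first-order autoregressive (AR(1)) dynamic, that the Dirichlet process is approximated by truncated stick-breaking, and that a series keeps its current topic with a fixed probability. The names `stick_breaking`, `generate_series`, and the parameter `stay` are hypothetical.

```python
import numpy as np


def stick_breaking(alpha, k, rng):
    """Truncated stick-breaking weights for a Dirichlet process with k atoms."""
    betas = rng.beta(1.0, alpha, size=k)
    # Fraction of the stick still unbroken before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    weights = betas * remaining
    return weights / weights.sum()  # renormalise because of the truncation


def generate_series(n_steps, n_topics=3, n_words=4, stay=0.95, seed=0):
    """Sample one series that switches between topics over AR(1) "words"."""
    rng = np.random.default_rng(seed)

    # Assumed "words": each is an AR(1) coefficient plus a noise scale.
    ar_coef = rng.uniform(0.2, 0.95, size=n_words)
    noise = rng.uniform(0.1, 1.0, size=n_words)

    # Each topic is a distribution over the words.
    topics = rng.dirichlet(np.ones(n_words), size=n_topics)

    # Series-specific topic preferences (the individual variability).
    topic_weights = stick_breaking(1.0, n_topics, rng)

    y = np.zeros(n_steps)
    topic_path = np.zeros(n_steps, dtype=int)
    word_path = np.zeros(n_steps, dtype=int)

    topic = rng.choice(n_topics, p=topic_weights)
    for t in range(n_steps):
        # With probability 1 - stay, redraw the topic from the series weights.
        if t > 0 and rng.random() > stay:
            topic = rng.choice(n_topics, p=topic_weights)
        # The active topic picks which word drives the dynamics at step t.
        word = rng.choice(n_words, p=topics[topic])
        prev = y[t - 1] if t > 0 else 0.0
        y[t] = ar_coef[word] * prev + noise[word] * rng.standard_normal()
        topic_path[t] = topic
        word_path[t] = word
    return y, topic_path, word_path
```

In this sketch the words are drawn at random up front; in the method described above they are learned jointly with the rest of the model, which is the part this example does not attempt to show.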
