Temporally-Reweighted Chinese Restaurant Process Mixtures for Clustering, Imputing, and Forecasting Multivariate Time Series

This article proposes a Bayesian nonparametric method for forecasting, imputation, and clustering in sparsely observed, multivariate time series data. The method is appropriate for jointly modeling hundreds of time series with widely varying, non-stationary dynamics. Given a collection of $N$ time series, the Bayesian model first partitions them into independent clusters using a Chinese restaurant process prior. Within a cluster, all time series are modeled jointly using a novel "temporally-reweighted" extension of the Chinese restaurant process mixture. Markov chain Monte Carlo techniques are used to obtain samples from the posterior distribution, which are then used to form predictive inferences. We apply the technique to challenging forecasting and imputation tasks using seasonal flu data from the US Centers for Disease Control and Prevention, demonstrating superior forecasting accuracy and competitive imputation accuracy compared with multiple widely used baselines. We further show that the model discovers interpretable clusters in datasets with hundreds of time series, using macroeconomic data from the Gapminder Foundation.
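To make the first modeling step concrete, the following is a minimal sketch of drawing a partition of $N$ items from a standard Chinese restaurant process prior. It illustrates only the textbook sequential construction, not the temporally-reweighted extension or the posterior inference described above. The function name `sample_crp_partition` and the concentration parameter `alpha` are assumptions introduced for this illustration.

```python
import random
from typing import List


def sample_crp_partition(n: int, alpha: float, rng: random.Random) -> List[int]:
    """Draw cluster labels for n items from a standard CRP prior.

    Hypothetical helper for illustration only. `alpha` is the
    concentration parameter: larger values favor more clusters.
    """
    if n < 0 or alpha <= 0:
        raise ValueError("n must be non-negative and alpha must be positive")
    labels: List[int] = []
    counts: List[int] = []  # counts[k] = number of items already in cluster k
    for _ in range(n):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


if __name__ == "__main__":
    # Example: assign 10 time series (identified by index) to clusters.
    print(sample_crp_partition(10, alpha=1.0, rng=random.Random(0)))
```

Each returned label is the cluster index of the corresponding time series. Under this prior the number of clusters is not fixed in advance and grows slowly with $N$, which is what lets the model adapt to collections of series with widely varying dynamics.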
