Deep Switching Auto-Regressive Factorization: Application to Time Series Forecasting

We introduce deep switching auto-regressive factorization (DSARF), a deep generative model for spatio-temporal data capable of unraveling recurring patterns in the data and performing robust short- and long-term predictions. Like other factor analysis methods, DSARF approximates high-dimensional data by a product of time-dependent weights and spatially dependent factors. These weights and factors are in turn represented in terms of lower-dimensional latent variables that are inferred using stochastic variational inference. DSARF differs from state-of-the-art techniques in that it parameterizes the weights with a deep switching vector auto-regressive likelihood governed by a Markovian prior, which captures the non-linear inter-dependencies among the weights and thereby characterizes multimodal temporal dynamics. The result is a flexible hierarchical deep generative factor analysis model that can be extended to (i) provide a collection of potentially interpretable states abstracted from the process dynamics, and (ii) perform short- and long-term vector time series prediction in a complex multi-relational setting. Our extensive experiments, which include simulated data and real data from a wide range of applications such as climate change, weather forecasting, traffic, infectious disease spread, and nonlinear physical systems, attest to the superior performance of DSARF in terms of long- and short-term prediction error when compared with state-of-the-art methods.
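
As a minimal sketch of the generative structure described above, one plausible reading of the abstract is the following, where the notation (observations y_t, weights w_t, factors F, switching state s_t, auto-regressive order L, and neural transition parameters theta) is our own assumption rather than the paper's:

\begin{align}
  % Markovian prior over the discrete switching states
  s_t \mid s_{t-1} &\sim \mathrm{Cat}\bigl(\boldsymbol{\pi}_{s_{t-1}}\bigr), \\
  % Deep switching vector auto-regression of order L on the low-dimensional weights
  \mathbf{w}_t \mid \mathbf{w}_{t-L:t-1},\, s_t &\sim \mathcal{N}\bigl(\mu_{\theta_{s_t}}(\mathbf{w}_{t-L:t-1}),\, \Sigma_{s_t}\bigr), \\
  % Factorized observation model: spatial factors map the weights back to the data space
  \mathbf{y}_t \mid \mathbf{w}_t,\, \mathbf{F} &\sim \mathcal{N}\bigl(\mathbf{F}^{\top}\mathbf{w}_t,\, \sigma^2 \mathbf{I}\bigr).
\end{align}

Under this reading, each discrete state s_t selects a neural transition function, so different regimes of the series are modeled by different non-linear auto-regressions over the weights, while the factors tie the low-dimensional dynamics back to the high-dimensional observations; the posterior over the weights, factors, and switching states would then be approximated with stochastic variational inference, as the abstract states.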
