Context-tree weighting for real-valued time series: Bayesian inference with hierarchical mixture models

Real-valued time series are ubiquitous in the sciences and engineering. In this work, a general, hierarchical Bayesian modelling framework is developed for building mixture models for time series. This development is based, in part, on the use of context trees, and it includes a collection of effective algorithmic tools for learning and inference. A discrete context (or 'state') is extracted for each sample, consisting of a discretised version of some of the most recent observations preceding it. The set of all relevant contexts is represented as a discrete context tree. At the bottom level, a different real-valued time series model is associated with each context-state, i.e., with each leaf of the tree. This defines a very general framework that can be used in conjunction with any existing model class to build flexible and interpretable mixture models. Extending the idea of context-tree weighting leads to algorithms that allow for efficient, exact Bayesian inference in this setting. The utility of the general framework is illustrated in detail when autoregressive (AR) models are used at the bottom level, resulting in a nonlinear AR mixture model. The associated methods are found to outperform several state-of-the-art techniques on simulated and real-world experiments.
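To make the construction concrete, the following is a minimal sketch of the two ingredients described above: extracting a discrete context from the quantised recent past, and attaching a separate AR model to each context. It rests on several simplifying assumptions that are not taken from the paper: a binary quantiser with a single threshold, a fixed context depth (i.e., a complete tree rather than a learned one), and ordinary least squares in place of Bayesian inference. It does not implement context-tree weighting itself, and the function names and parameters (`extract_contexts`, `fit_ar_per_context`, `depth`, `order`, `threshold`) are hypothetical.

```python
import numpy as np

def extract_contexts(x, depth=2, threshold=0.0):
    """Return the discrete context of each sample x[t], t >= depth.

    Assumption: a binary quantiser (1 if sample > threshold, else 0).
    The context is (q[t-1], ..., q[t-depth]), most recent symbol first.
    """
    q = (np.asarray(x, dtype=float) > threshold).astype(int)
    return [tuple(int(v) for v in q[t - depth:t][::-1])
            for t in range(depth, len(q))]

def fit_ar_per_context(x, depth=2, order=1, threshold=0.0):
    """Fit one AR(order) model with intercept per context by least squares.

    Returns a dict mapping each observed context to its coefficient
    vector [intercept, phi_1, ..., phi_order].
    """
    x = np.asarray(x, dtype=float)
    q = (x > threshold).astype(int)
    start = max(depth, order)
    rows, targets = {}, {}
    for t in range(start, len(x)):
        ctx = tuple(int(v) for v in q[t - depth:t][::-1])
        # Regressors: constant term, then x[t-1], ..., x[t-order].
        row = np.concatenate(([1.0], x[t - order:t][::-1]))
        rows.setdefault(ctx, []).append(row)
        targets.setdefault(ctx, []).append(x[t])
    models = {}
    for ctx in rows:
        A = np.array(rows[ctx])
        y = np.array(targets[ctx])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        models[ctx] = coef
    return models
```

With `depth=1` this reduces to a two-regime threshold AR model, where the regime is chosen by the sign of the previous sample. In the framework described above, the tree itself would be treated as unknown and inferred, rather than fixed in advance as it is here.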
