Optimal prediction of data with unknown abrupt change points

We develop a novel methodology for predicting time series whose data-generating distribution undergoes abrupt changes at unknown times. Building on Kolmogorov and Tikhomirov's ε-entropy, we propose a concept called ε-predictability, which quantifies the size of a model class and the maximal number of structural changes under which asymptotically optimal prediction remains achievable. To predict under abrupt changes, our basic idea is to use an ε-net to discretize a nonparametric or parametric model class with an appropriately chosen ε, and then to apply kinetic model averaging over the quantizers. Under reasonable assumptions, we prove that the average predictive performance is asymptotically as good as that of the oracle, i.e., the predictor that knows all the data-generating distributions in advance. We show that the assumptions hold for a rather wide class of time variations. The results also address some puzzles related to the “prediction-inference dilemma” in the context of change point analysis.
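The two-step idea described above (discretize the model class with an ε-net, then average over the resulting quantizers with weights that can shift after a change) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the model class is unit-variance Gaussians with mean in a bounded interval, the ε-net is a uniform grid, and the weight update is a fixed-share style rule from the prediction-with-expert-advice literature, used as a stand-in because the abstract does not specify the kinetic model averaging update. The names `epsilon_net` and `fixed_share_predict` and the parameter `alpha` are hypothetical.

```python
import math
import random


def epsilon_net(lo, hi, eps):
    """Return grid points such that every value in [lo, hi] lies
    within eps of at least one grid point (a uniform eps-net)."""
    n = max(1, math.ceil((hi - lo) / (2 * eps)))
    step = (hi - lo) / n  # cell width, at most 2 * eps
    return [lo + (i + 0.5) * step for i in range(n)]


def fixed_share_predict(data, net, alpha):
    """Predict each observation before seeing it, as the weighted
    mean of the net points.

    Assumption: each net point m stands for a unit-variance Gaussian
    model N(m, 1). After each observation the weights get a Bayesian
    update, then a fraction alpha (0 < alpha < 1) of the mass is
    spread uniformly so that weight can move quickly after an abrupt
    change. This is a stand-in, not the paper's update rule.
    """
    k = len(net)
    weights = [1.0 / k] * k
    preds = []
    for x in data:
        preds.append(sum(w * m for w, m in zip(weights, net)))
        # Negative log-likelihoods up to a constant; subtract the
        # minimum before exponentiating to avoid underflow.
        losses = [0.5 * (x - m) ** 2 for m in net]
        best = min(losses)
        post = [w * math.exp(-(l - best)) for w, l in zip(weights, losses)]
        total = sum(post)
        post = [p / total for p in post]
        # Sharing step: keep (1 - alpha) of the posterior mass and
        # redistribute alpha uniformly over all net points.
        weights = [(1 - alpha) * p + alpha / k for p in post]
    return preds


# Toy data with one abrupt change in the mean (illustrative only).
rng = random.Random(0)
data = [rng.gauss(-1.0, 0.3) for _ in range(200)]
data += [rng.gauss(1.0, 0.3) for _ in range(200)]

net = epsilon_net(-2.0, 2.0, eps=0.125)
preds = fixed_share_predict(data, net, alpha=0.02)
```

In this sketch, a smaller ε gives a finer net and a smaller discretization error but more quantizers to average over, and `alpha` controls how quickly weight can return to a model that was ruled out before a change. How the paper balances these quantities is governed by its ε-predictability analysis, which this example does not reproduce.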
