Generic Bounds On The Maximum Deviations In Sequential Prediction: An Information-Theoretic Analysis

In this paper, we derive generic bounds on the maximum deviations in prediction errors for sequential prediction via an information-theoretic approach. The fundamental bounds are shown to depend only on the conditional entropy of the data point to be predicted given the previous data points. In the asymptotic case, the bounds are achieved if and only if the prediction error is white and uniformly distributed.
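
The dependence on conditional entropy can be illustrated with a short sketch of the standard maximum-entropy argument. The notation here is an assumption for illustration and is not taken from the abstract: $x_k$ is the data point to be predicted, $\hat{x}_k = f(x_0,\ldots,x_{k-1})$ is any predictor, $e_k = x_k - \hat{x}_k$ is the prediction error, $h(\cdot)$ is differential entropy in bits (assumed to exist), and $D$ is an almost-sure bound on $|e_k|$. The exact statements and conditions in the paper may differ.

```latex
% Assumed setup: |e_k| <= D almost surely, so e_k is supported on an
% interval of length 2D. Among all densities on that interval, the
% uniform density maximizes differential entropy:
\begin{align}
  h(e_k) &\le \log_2 (2D). \\
\intertext{Conditioning cannot increase entropy, and $\hat{x}_k$ is a
function of the past, so subtracting it does not change the conditional
entropy:}
  h(e_k) &\ge h(e_k \mid x_0,\ldots,x_{k-1})
          = h(x_k \mid x_0,\ldots,x_{k-1}). \\
\intertext{Combining the two inequalities gives a lower bound on the
maximum deviation:}
  D &\ge \tfrac{1}{2}\, 2^{\,h(x_k \mid x_0,\ldots,x_{k-1})}.
\end{align}
```

In this sketch, the first inequality is tight only when $e_k$ is uniform on $[-D, D]$, and the second only when $e_k$ is independent of the past data points, which is consistent with the uniformity and whiteness conditions stated above.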
