Universal prediction of random binary sequences in a noisy environment

Let $\{(X_t, Y_t) \}_{t \in \mathbb{Z}}$ be a stationary time series where $X_t$ is binary valued and $Y_t$, the noisy observation of $X_t$, is real valued. Letting $\mathbf{P}$ denote the probability measure governing the joint process $\{(X_t, Y_t)\}$, we characterize $U(l, \mathbf{P})$, the optimal asymptotic average performance of a predictor allowed to base its prediction for $X_t$ on $Y_1, \ldots, Y_{t-1}$, where performance is evaluated using the loss function $l$. It is shown that the stationarity and ergodicity of $\mathbf{P}$, combined with an additional "conditional mixing" condition, suffice to establish $U(l, \mathbf{P})$ as the fundamental limit for the almost sure asymptotic performance. $U(l, \mathbf{P})$ can thus be thought of as a generalized notion of the Shannon entropy, one which captures the sensitivity of the underlying clean sequence to noise. For the case where $\mathbf{X}=\{ X_t \}$ is governed by $P$ and $Y_t$ is given by $Y_t=g(X_t, N_t)$, where $g$ is any deterministic function and $\mathbf{N}=\{ N_t \}$, the noise, is any i.i.d. process independent of $\mathbf{X}$ (namely, the case where the "clean" process $\mathbf{X}$ is passed through a fixed memoryless channel), it is shown that, analogously to the noiseless case, there exist universal predictors which do not depend on $P$ yet attain $U(l, \mathbf{P})$. Furthermore, it is shown that in some special cases of interest [e.g., the binary symmetric channel (BSC) and the absolute loss function], there exist twofold universal predictors which do not depend on the noise distribution either. The existence of such universal predictors is established by means of an explicit construction which builds on recent advances in the theory of prediction of individual sequences in the presence of noise.
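To make the setting concrete, the sketch below simulates the BSC special case mentioned above: a binary Markov chain $\{X_t\}$ is observed through a binary symmetric channel, and a predictor sees only the noisy past $Y_1, \ldots, Y_{t-1}$, paying the absolute loss $|X_t - b_t|$ for its prediction $b_t \in [0,1]$. The first-order plug-in predictor used here is a hypothetical illustration of the problem setup, not the paper's universal construction; the parameter names (`p_stay`, `p_flip`) are likewise illustrative.

```python
import random

def simulate(T=20000, p_stay=0.9, p_flip=0.1, seed=0):
    """Illustrative sketch (not the paper's predictor): predict a
    binary Markov chain X_t, observed through a BSC with crossover
    probability p_flip, from the noisy past under absolute loss.

    Returns the time-averaged absolute loss (1/T) * sum |X_t - b_t|.
    """
    rng = random.Random(seed)
    x = rng.randint(0, 1)          # current clean bit
    # Laplace-smoothed counts of noisy transitions y_{t-1} -> y_t,
    # used as a naive first-order plug-in predictor.
    counts = [[1, 1], [1, 1]]
    y_prev = None
    total_loss = 0.0
    n_pred = 0
    for _ in range(T):
        # Clean process: a two-state Markov chain staying put w.p. p_stay.
        if rng.random() > p_stay:
            x = 1 - x
        # BSC: each clean bit is flipped independently w.p. p_flip.
        y = x if rng.random() > p_flip else 1 - x
        if y_prev is not None:
            c0, c1 = counts[y_prev]
            b = c1 / (c0 + c1)      # prediction b_t in [0, 1]
            total_loss += abs(x - b)  # absolute loss against the CLEAN bit
            n_pred += 1
            counts[y_prev][y] += 1
        y_prev = y
    return total_loss / n_pred
```

Even this naive scheme beats the trivial average loss of $1/2$ because the noisy observations retain information about the clean chain's state; the paper's point is to characterize the best achievable limit $U(l, \mathbf{P})$ and to attain it without knowing the source law $P$ (and, for the BSC with absolute loss, without knowing the noise distribution either).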
