Discrete universal filtering through incremental parsing

In the discrete filtering problem, a data sequence over a finite alphabet is assumed to be corrupted by a discrete memoryless channel. The goal is to reconstruct the clean sequence with as high a fidelity as possible by causal processing of the noisy sequence alone: the reconstruction at time t may depend only on noisy observations occurring no later than t. We study a universal version of this problem, in which no assumptions are made about the distribution of the clean data, which may even be nonstochastic. Using techniques from universal data compression, in particular the incremental parsing rule of LZ78, we derive practical and efficient algorithms for the universal filtering of discrete sources. A finite-memory filter of order k has the property that the reconstruction at any time t is a time-invariant function only of the noisy observations occurring between times t-k and t, inclusive. The universal filtering algorithms perform essentially as well, in an expected sense (with respect to the noise process), as the best finite-memory filter of any fixed order, determined with full knowledge of the actual clean data sequence, for all such data sequences. We also consider more general finite-state filters and show that any such filter is arbitrarily well approximated by a finite-memory filter of growing order, thereby establishing the universality of the proposed algorithms with respect to this larger class. This result can be viewed as the filtering analogue of the well-known optimality of LZ78 relative to the class of finite-state compressors.
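
As a rough illustration of two notions used above, the following Python sketch shows the LZ78 incremental parsing rule and the shape of a finite-memory filter of order k. It is not the filtering algorithm of the paper. The names `lz78_parse`, `finite_memory_filter`, and `rule` are hypothetical, and the handling of the first k time steps (a shorter window) is an assumption made only to keep the example runnable.

```python
def lz78_parse(seq):
    """Incremental (LZ78) parsing: each phrase is the shortest prefix of the
    remaining input that has not already appeared as a phrase."""
    seen = set()
    phrases = []
    current = ""
    for symbol in seq:
        current += symbol
        if current not in seen:
            # First time this string occurs: close the phrase here.
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # Trailing fragment that duplicates an earlier phrase.
        phrases.append(current)
    return phrases


def finite_memory_filter(noisy, k, rule):
    """Apply a time-invariant map `rule` to the noisy symbols observed
    between times t-k and t, inclusive, for every t.

    Assumption: for t < k the window is simply shorter, since the
    abstract does not say how the initial steps are handled."""
    estimates = []
    for t in range(len(noisy)):
        window = noisy[max(0, t - k): t + 1]
        estimates.append(rule(window))
    return estimates


if __name__ == "__main__":
    print(lz78_parse("1011010100010"))
    # A trivial order-1 rule that outputs the oldest symbol in the window.
    print(finite_memory_filter("abcd", 1, lambda w: w[0]))
```

In this sketch the filter is causal by construction, because the window passed to `rule` never extends past time t. A universal scheme would have to choose `rule` from the noisy data alone, which is the part the sketch leaves out.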