On universal simulation of information sources using training data

We consider the problem of universal simulation of an unknown random process, or information source, from a certain parametric family, given a training sequence from that source and a limited budget of purely random bits. The goal is to generate another random sequence (of the same length or shorter) whose probability law is identical to that of the given training sequence, but with minimum statistical dependency (minimum mutual information) between the input training sequence and the output sequence. We derive lower bounds on the mutual information that are shown to be achievable by conceptually simple algorithms proposed here. We show that the behavior of the minimum achievable mutual information depends critically on the relative number of random bits and on the lengths of the input and output sequences. While in the ordinary (nonuniversal) simulation problem, the number of random bits per symbol must exceed the entropy rate H of the source in order to simulate it faithfully, in the universal simulation problem considered here, faithful preservation of the probability law is not a problem, yet the same minimum rate of H random bits per symbol is still needed to essentially eliminate the statistical dependency between the input sequence and the output sequence. The results are extended to more general information measures.
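To make the universal-simulation idea concrete, here is a minimal sketch (not the paper's algorithm, and the function name is mine) for the special case of an unknown i.i.d. source: emit a uniformly random permutation of the training sequence. Since i.i.d. sources are exchangeable, every permutation of the training sequence is equally likely under the true (unknown) law, so the output has exactly the same probability law as the input, with no knowledge of the source parameters; the random bits consumed by the shuffle are what reduce the dependency between input and output.

```python
import random


def simulate_iid_universal(train, rng=None):
    """Universal simulation sketch for an unknown i.i.d. source.

    Emits a uniformly random permutation of `train`. Because an i.i.d.
    source is exchangeable, the output sequence has the same probability
    law as the input, whatever the (unknown) source parameters are.
    The shuffle consumes roughly log2(n!) random bits; in the paper's
    framework, the random-bit budget governs how much residual mutual
    information remains between input and output.
    """
    rng = rng or random.Random()
    out = list(train)
    rng.shuffle(out)
    return out


# The output is always a sequence of the same empirical distribution
# (same "type class") as the training sequence:
sample = simulate_iid_universal("aabbcc", random.Random(7))
```

Note that this sketch preserves the probability law exactly but leaves substantial dependency (the output always shares the input's type); the paper's point is that driving that dependency to zero requires about H random bits per symbol.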
