The confidence interval of entropy estimation through a noisy channel

Suppose a stationary memoryless source is observed through a discrete memoryless channel. Determining analytical confidence intervals on the source entropy is known to be a difficult problem, even when the observation channel is noiseless. In this paper, we determine confidence intervals for the estimation of source entropy observed through discrete memoryless channels with invertible transition matrices. A lower bound is given on the minimum number of samples required to guarantee a desired confidence interval. None of these results requires prior knowledge of the source distribution beyond the alphabet size. When the alphabet size is countably infinite or unknown, we illustrate an inherent difficulty in estimating the source entropy.
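
The setting can be illustrated with a simple plug-in estimator: form the empirical distribution of the channel outputs, invert the known transition matrix to recover an estimate of the source distribution, and evaluate the entropy functional on that estimate. The sketch below is a minimal illustration of this idea under stated assumptions, not the paper's construction; the channel matrix W, the sample array ys, and the helper plug_in_source_entropy are names introduced for this example only.

```python
# Minimal sketch (illustrative, not the paper's method): plug-in estimate of
# the source entropy when the observation channel W is a known, invertible
# discrete memoryless channel.
import numpy as np

def plug_in_source_entropy(ys, W, alphabet_size):
    """Estimate the source entropy (in bits) from channel outputs ys,
    assuming an invertible transition matrix W with W[x, y] = P(y | x)."""
    # Empirical distribution of the observed outputs.
    counts = np.bincount(ys, minlength=alphabet_size)
    q_hat = counts / counts.sum()
    # Invert the channel: q = p @ W, so p_hat = q_hat @ inv(W).
    p_hat = q_hat @ np.linalg.inv(W)
    # For small samples the inverted estimate may leave the probability
    # simplex; clip and renormalize before plugging into the entropy.
    p_hat = np.clip(p_hat, 0.0, None)
    p_hat /= p_hat.sum()
    nz = p_hat > 0
    return -np.sum(p_hat[nz] * np.log2(p_hat[nz]))

# Toy usage: a binary source observed through a binary symmetric channel.
rng = np.random.default_rng(0)
p_true = np.array([0.3, 0.7])
W = np.array([[0.9, 0.1], [0.1, 0.9]])        # BSC with crossover 0.1
xs = rng.choice(2, size=10_000, p=p_true)
ys = np.array([rng.choice(2, p=W[x]) for x in xs])
print(plug_in_source_entropy(ys, W, alphabet_size=2))  # close to H(0.3) ~ 0.881 bits
```

The clipping step reflects the practical point that channel inversion can produce negative components at finite sample sizes, which is one reason confidence statements for this kind of estimator require care.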
