Entropy Bounds for Discrete Random Variables via Maximal Coupling

This paper derives new bounds on the difference between the entropies of two discrete random variables in terms of the local and total variation distances between their probability mass functions. The derivation relies on maximal coupling, and the bounds apply to discrete random variables defined on finite or countably infinite alphabets. Loosened versions of these bounds are shown to recover some previously reported results. The use of the new bounds is exemplified for the Poisson approximation, where bounds on the local and total variation distances follow from Stein's method.
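As a rough numerical illustration of the quantities involved (not of the paper's bounds themselves), the sketch below compares a Binomial(n, p) law with its Poisson(np) approximation. It assumes the usual conventions: the total variation distance is half the l1 distance between the pmfs, and the local distance is their l-infinity distance; the parameter values are chosen only for illustration.

```python
# Illustrative numerical check (not the paper's bound): compare a Binomial(n, p)
# law with its Poisson(np) approximation in terms of total variation distance,
# local distance, and entropy difference. Parameter choices are assumptions.
import numpy as np
from scipy.stats import binom, poisson

n, p = 1000, 0.01                    # Binomial parameters
lam = n * p                          # Matching Poisson mean
support = np.arange(0, n + 1)

P = binom.pmf(support, n, p)         # Binomial pmf on {0, ..., n}
Q = poisson.pmf(support, lam)        # Poisson pmf restricted to the same range
                                     # (the mass above n is negligible here)

# Total variation distance: d_TV(P, Q) = (1/2) * sum_k |P(k) - Q(k)|
d_tv = 0.5 * np.sum(np.abs(P - Q))

# Local distance: sup_k |P(k) - Q(k)|
d_loc = np.max(np.abs(P - Q))

def entropy_nats(pmf):
    """Shannon entropy in nats, ignoring zero-probability terms."""
    pmf = pmf[pmf > 0]
    return -np.sum(pmf * np.log(pmf))

delta_H = abs(entropy_nats(P) - entropy_nats(Q))

print(f"d_TV          = {d_tv:.3e}")
print(f"d_loc         = {d_loc:.3e}")
print(f"|H(P) - H(Q)| = {delta_H:.3e}")
```

The connection to coupling is that, for a maximal coupling of two pmfs P and Q, the probability that the coupled variables differ equals their total variation distance, which is what makes coupling arguments a natural route to bounding entropy differences.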
