On the Discontinuity of the Shannon Information Measures

The Shannon information measures are well known to be continuous functions of the probability distribution when the alphabet is finite. In this paper, we show that these measures are discontinuous with respect to almost all commonly used "distance" measures when the alphabet is countably infinite. Such "distance" measures include the Kullback-Leibler divergence and the variational distance. Specifically, we show that all the Shannon information measures are in fact discontinuous at every probability distribution. The proofs are based on a probability distribution that can be realized by a discrete-time Markov chain with a countably infinite number of states. Our findings reveal that the limiting probability distribution may not fully characterize the asymptotic behavior of such a Markov chain. These results explain why certain existing information-theoretic tools are restricted to finite alphabets, and provide hints on how these tools can be extended to countably infinite alphabets.
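The discontinuity claimed above can be illustrated with a standard construction (a minimal numerical sketch, not the specific distribution used in the paper): a sequence of distributions P_n on a countably infinite alphabet that converges to a point mass in variational distance while its entropy stays bounded away from zero. Here P_n places mass 1 − 1/ln n on one symbol and spreads the remainder uniformly over n tail symbols; the helper names below are illustrative only.

```python
import math

def entropy_bits(p):
    """Shannon entropy in bits of a probability vector (zero terms skipped)."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def variational_distance(p, q):
    """L1 (variational) distance between two finite-support distributions."""
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return sum(abs(a - b) for a, b in zip(p, q))

def near_deterministic(n):
    """P_n: mass 1 - 1/ln(n) on symbol 0, the rest spread over n tail symbols.

    As n grows, P_n converges to the point mass (1, 0, 0, ...) in
    variational distance, yet H(P_n) -> log2(e) bits rather than H = 0,
    exhibiting the discontinuity of entropy on a countably infinite alphabet.
    """
    eps = 1.0 / math.log(n)
    return [1.0 - eps] + [eps / n] * n

for n in (10**2, 10**4, 10**6):
    p = near_deterministic(n)
    d = variational_distance(p, [1.0])
    print(f"n={n:>8}  V(P_n, point mass)={d:.4f}  H(P_n)={entropy_bits(p):.4f} bits")
```

Running the loop shows the variational distance shrinking toward 0 while the entropy remains above 1 bit, even though the entropy of the limiting point mass is 0. On any finite alphabet this cannot happen, since entropy is continuous there.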
