On Characterization of Entropy Function via Information Inequalities

Given $n$ discrete random variables $\Omega = \{X_1, \ldots, X_n\}$, every nonempty subset $\alpha$ of $\{1, 2, \ldots, n\}$ has an associated joint entropy $H(X_\alpha)$, where $X_\alpha = \{X_i : i \in \alpha\}$. This defines a function on $2^{\{1, 2, \ldots, n\}}$ taking values in $[0, +\infty)$, which we call the entropy function of $\Omega$. The nonnegativity of the joint entropies implies that this function is nonnegative; the nonnegativity of the conditional joint entropies implies that it is nondecreasing; and the nonnegativity of the conditional mutual informations implies that for any two subsets $\alpha$ and $\beta$ of $\{1, 2, \ldots, n\}$, $H_\Omega(\alpha) + H_\Omega(\beta) \ge H_\Omega(\alpha \cup \beta) + H_\Omega(\alpha \cap \beta)$. These properties are the so-called basic information inequalities of Shannon's information measures. Do these properties fully characterize the entropy function? To make this question precise, we view an entropy function as a $(2^n - 1)$-dimensional vector whose coordinates are indexed by the nonempty subsets of the ground set $\{1, 2, \ldots, n\}$. Let $\Gamma_n$ be the cone in $\mathbb{R}^{2^n - 1}$ consisting of all vectors that have these three properties when viewed as functions on $2^{\{1, 2, \ldots, n\}}$, and let $\Gamma_n^*$ be the set of all $(2^n - 1)$-dimensional vectors that are the entropy functions of some set of $n$ discrete random variables. The question can then be restated as: is it true that $\bar{\Gamma}_n^* = \Gamma_n$ for every $n$, where $\bar{\Gamma}_n^*$ denotes the closure of $\Gamma_n^*$? The answer is "yes" for $n = 2$ and $n = 3$, as proved in our previous work, and based on intuition one might believe that the answer should be "yes" for every $n$. The main discovery of this paper is a new information-theoretic inequality involving four discrete random variables which gives a negative answer to this fundamental problem in information theory: $\bar{\Gamma}_n^*$ is strictly smaller than $\Gamma_n$ whenever $n > 3$. While this new inequality gives a nontrivial outer bound on the cone $\bar{\Gamma}_4^*$, an inner bound for $\bar{\Gamma}_4^*$ is also given. The inequality is also extended to any number of random variables.
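The following is a minimal numerical sketch (not from the paper) of the objects discussed above: it computes the entropy function of a joint distribution as a vector indexed by nonempty subsets, and checks that this vector satisfies the three basic inequalities, i.e., that it lies in $\Gamma_n$. The helper names `entropy_function` and `satisfies_basic_inequalities` are illustrative choices, not part of the original work.

```python
import itertools
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability vector, ignoring zero entries."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def entropy_function(joint, n):
    """Entropy function H_Omega: a dict mapping each nonempty subset of
    {0, ..., n-1} to the joint entropy of the corresponding marginal.
    `joint` is an n-dimensional array of joint probabilities."""
    H = {}
    for r in range(1, n + 1):
        for alpha in itertools.combinations(range(n), r):
            axes = tuple(i for i in range(n) if i not in alpha)
            marginal = joint.sum(axis=axes).ravel()
            H[frozenset(alpha)] = entropy(marginal)
    return H

def satisfies_basic_inequalities(H, n, tol=1e-9):
    """Check nonnegativity, monotonicity, and submodularity of H,
    i.e., membership of the entropy vector in the cone Gamma_n."""
    subsets = [frozenset(s) for r in range(1, n + 1)
               for s in itertools.combinations(range(n), r)]
    get = lambda s: 0.0 if not s else H[frozenset(s)]  # H(empty set) = 0
    for a in subsets:
        if get(a) < -tol:                                # nonnegativity
            return False
        for b in subsets:
            if a <= b and get(a) > get(b) + tol:         # nondecreasing
                return False
            if get(a) + get(b) + tol < get(a | b) + get(a & b):  # submodular
                return False
    return True

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, sizes = 4, (2, 2, 2, 2)
    joint = rng.random(sizes)
    joint /= joint.sum()                                 # random joint distribution
    H = entropy_function(joint, n)
    print(satisfies_basic_inequalities(H, n))            # expected: True
```

Every entropy vector produced this way lies in $\Gamma_n$ (so $\Gamma_n^* \subseteq \Gamma_n$); the paper's result is that for $n > 3$ the converse fails even after taking closure, because some points of $\Gamma_n$ violate the new non-Shannon-type inequality.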
