Normalized information-based divergences

This paper is devoted to the mathematical study of divergences based on mutual information that are well suited to categorical random vectors. These divergences generalize the “entropy distance” and the “information distance.” Their main characteristic is that they combine a complexity term with the mutual information. We introduce the notion of (normalized) information-based divergence, propose several examples, and discuss their mathematical properties, in particular within a prediction framework.
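As a concrete illustration of the two classical objects being generalized (this sketch is ours, not the paper's): the entropy distance is D(X,Y) = H(X|Y) + H(Y|X) = H(X,Y) − I(X;Y), and a standard normalized variant is d(X,Y) = 1 − I(X;Y)/H(X,Y), which lies in [0,1]. The minimal Python sketch below estimates these quantities from paired categorical samples; the helper names are ours.

```python
import math
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of categorical values."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())

def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def entropy_distance(xs, ys):
    """Entropy distance D(X,Y) = H(X,Y) - I(X;Y) = H(X|Y) + H(Y|X), in bits."""
    return entropy(list(zip(xs, ys))) - mutual_information(xs, ys)

def normalized_entropy_distance(xs, ys):
    """Normalized variant d(X,Y) = 1 - I(X;Y)/H(X,Y), lying in [0, 1]."""
    h_xy = entropy(list(zip(xs, ys)))
    return 1.0 - mutual_information(xs, ys) / h_xy if h_xy > 0 else 0.0

if __name__ == "__main__":
    # Two paired categorical samples with partial dependence.
    x = ["a", "a", "b", "b", "a", "b", "a", "b"]
    y = ["u", "u", "v", "v", "u", "v", "v", "u"]
    print(f"I(X;Y) = {mutual_information(x, y):.3f} bits")
    print(f"D(X,Y) = {entropy_distance(x, y):.3f} bits")
    print(f"d(X,Y) = {normalized_entropy_distance(x, y):.3f}")
```

Both distances vanish exactly when each variable determines the other, and the normalized form d stays bounded in [0,1] regardless of alphabet size, which is what makes normalized divergences convenient for comparing categorical vectors of differing complexities.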
