Clustering Using Elements of Information Theory

This paper proposes a clustering algorithm based on an information-theoretic criterion. The cross entropy between elements in different clusters is used as a measure of the quality of the partition. The proposed algorithm first uses "classical" clustering algorithms to form small initial regions (auxiliary clusters), which are then merged to construct the final clusters. The algorithm was tested on several databases with different spatial distributions.
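The two-stage idea described above (auxiliary clusters first, then merging guided by cross entropy) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's method: k-means stands in for the "classical" initializer, the cross entropy between two clusters is estimated with a Gaussian Parzen window in the Renyi quadratic form, the pair with the lowest cross entropy is merged greedily, and the names `kmeans`, `cross_entropy`, `merge_clusters`, and the kernel width `sigma` are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; an assumed stand-in for the 'classical' initializer."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def cross_entropy(A, B, sigma):
    """Assumed estimator: -log of the mean Gaussian kernel value over all
    pairs (a in A, b in B), using a kernel of variance 2 * sigma**2."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    dim = A.shape[1]
    norm = (4.0 * np.pi * sigma**2) ** (dim / 2.0)
    v = np.exp(-d2 / (4.0 * sigma**2)).mean() / norm
    return -np.log(v + 1e-300)  # small constant guards against log(0)

def merge_clusters(X, labels, n_final, sigma):
    """Greedily merge the pair of clusters with the lowest cross entropy
    (the most overlapping pair) until n_final clusters remain."""
    labels = labels.copy()
    while len(np.unique(labels)) > n_final:
        ids = np.unique(labels)
        best = None
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                h = cross_entropy(X[labels == a], X[labels == b], sigma)
                if best is None or h < best[0]:
                    best = (h, a, b)
        labels[labels == best[2]] = best[1]
    return labels

# Usage on synthetic data: two well-separated blobs of 50 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
aux = kmeans(X, k=6)                              # auxiliary clusters
final = merge_clusters(X, aux, n_final=2, sigma=1.0)
```

In this sketch a low cross entropy between two auxiliary clusters means their estimated densities overlap strongly, so they are treated as parts of the same final cluster. The number of auxiliary clusters, the target number of final clusters, and `sigma` are free choices here and would need tuning for real data.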
