Rényi entropy and Cauchy-Schwarz mutual information applied to the MIFS-U variable selection algorithm: a comparative study

This paper examines the MIFS-U variable selection algorithm and presents an alternative method for estimating entropy and mutual information, the measures on which this selection algorithm is based. The method is founded on the Cauchy-Schwarz quadratic mutual information and the Rényi quadratic entropy, combined, for continuous variables, with Parzen window density estimation. Experiments were carried out on public-domain data, comparing this method with the original, widely used MIFS-U algorithm, which adopts the Shannon entropy definition and, for continuous variables, uses a histogram density estimator. The results show small variations between the two methods, which suggests a future investigation using a classifier, such as a neural network, to qualitatively evaluate these results in light of the final objective: greater classification accuracy.
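
The two estimators named above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a Gaussian Parzen kernel with a user-chosen width `sigma`, one-dimensional samples for the entropy, and an already discretized joint probability table for the mutual information. The function names `renyi_quadratic_entropy` and `cauchy_schwarz_qmi` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def renyi_quadratic_entropy(x, sigma):
    """Parzen-window estimate of Renyi's quadratic entropy (1-D samples).

    Assumes a Gaussian kernel of standard deviation `sigma` (an assumption
    of this sketch, not a value taken from the paper).
    """
    x = np.asarray(x, dtype=float)
    diffs = x[:, None] - x[None, :]  # all pairwise differences x_i - x_j
    # Integrating the product of two Gaussian kernels of variance sigma^2
    # yields a Gaussian of variance 2 * sigma^2 evaluated at x_i - x_j.
    var = 2.0 * sigma ** 2
    kernel = np.exp(-diffs ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
    information_potential = kernel.mean()  # (1 / N^2) * sum over all pairs
    return -np.log(information_potential)


def cauchy_schwarz_qmi(joint):
    """Cauchy-Schwarz quadratic mutual information of a discrete joint table."""
    p = np.asarray(joint, dtype=float)
    p = p / p.sum()                      # normalise to a probability table
    px = p.sum(axis=1)                   # marginal of the row variable
    py = p.sum(axis=0)                   # marginal of the column variable
    prod = np.outer(px, py)              # product of the marginals
    v_joint = np.sum(p ** 2)
    v_marginal = np.sum(prod ** 2)
    v_cross = np.sum(p * prod)
    # Zero when p equals the product of marginals, positive otherwise.
    return np.log(v_joint * v_marginal / v_cross ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.normal(size=200)
    print(renyi_quadratic_entropy(samples, sigma=0.5))
    print(cauchy_schwarz_qmi([[0.4, 0.1], [0.1, 0.4]]))
```

In this sketch the kernel width plays the role that the bin width plays in a histogram estimator, so any comparison between the two approaches depends on how both are chosen.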
