A New Criterion of Mutual Information Using R-value

Mutual information has a wide range of applications, including feature selection and classification. It is conventionally calculated with the statistical equations of information theory. In this paper, we propose a new criterion for mutual information based on the R-value, which captures the overlapping areas among classes in variables (features). The overlapping area of classes reflects the uncertainty of a variable, which corresponds to the meaning of entropy. We compare traditional mutual information and the R-value in the context of feature selection. Our experiments confirm that the proposed method performs better than traditional mutual information.
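
For reference, the traditional criterion mentioned above can be sketched for discrete data using the standard definition I(X;Y) = sum over (x, y) of p(x,y) log2[ p(x,y) / (p(x) p(y)) ]. The sketch below is only an illustration of that textbook formula with plug-in frequency estimates; it is not the R-value criterion proposed in the paper, and the function name and toy data are assumptions made for this example.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

# Toy data (assumed): one feature that determines the class, one that does not.
labels = [0, 0, 1, 1]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]

print(mutual_information(informative, labels))    # 1.0
print(mutual_information(uninformative, labels))  # 0.0
```

In a feature selection setting, features would be ranked by this score against the class label, with higher values indicating less remaining uncertainty about the class.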
