Feature Interaction Maximisation

Feature selection plays an important role in classification algorithms and is particularly useful for dimensionality reduction, since it retains the features with the highest discriminative power. This paper introduces a new feature-selection method called Feature Interaction Maximisation (FIM), which employs three-way interaction information as a measure of feature redundancy. FIM performs a forward greedy search, at each step selecting the feature that has maximum interaction information with the features already selected and maximum relevance to the class label. Experiments on three datasets from the UCI repository compare FIM with four well-known feature-selection methods: Information Gain (IG), Minimum Redundancy Maximum Relevance (mRMR), Double Input Symmetrical Relevance (DISR), and Interaction Gain Based Feature Selection (IGFS). Performance is assessed by the average classification accuracy of two classifiers, naive Bayes and k-nearest neighbour. The results show that FIM outperforms the other methods.
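As a concrete illustration, the sketch below implements an FIM-style forward greedy search in Python for discrete features. The scoring rule shown here (relevance to the class plus the average three-way interaction information between a candidate, each selected feature, and the class) and all function names are assumptions made for this example; the paper's exact criterion may combine or weight these terms differently.

```python
# Minimal sketch of an FIM-style forward greedy feature selection.
# Assumes discrete (categorical or binned) features and labels; the scoring
# rule is a plausible reading of the abstract, not the authors' exact formula.
import numpy as np
from collections import Counter

def entropy(*cols):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    joint = list(zip(*cols))
    n = len(joint)
    return -sum((c / n) * np.log2(c / n) for c in Counter(joint).values())

def mutual_info(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def interaction_info(x, y, z):
    """Three-way interaction information I(X; Y; Z) = I(X; Y | Z) - I(X; Y).
    Positive values indicate synergy, negative values indicate redundancy."""
    cond_mi = entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)
    return cond_mi - mutual_info(x, y)

def fim_select(X, y, k):
    """Forward greedy selection of k feature indices from the columns of X."""
    n_features = X.shape[1]
    relevance = [mutual_info(X[:, j], y) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]  # seed with the most relevant feature
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Relevance plus average interaction with the already-selected set.
            interaction = np.mean(
                [interaction_info(X[:, j], X[:, s], y) for s in selected]
            )
            score = relevance[j] + interaction
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(200, 8))  # 8 discrete features
    y = (X[:, 0] + X[:, 3]) % 2            # label depends jointly on features 0 and 3
    print(fim_select(X, y, k=3))
```

In this sketch a positive interaction term rewards candidates that are synergistic with the selected set, while a negative term penalises redundant ones, which matches the abstract's use of interaction information as a redundancy measure.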
