Robust Feature Selection from Microarray Data Based on Cooperative Game Theory and Qualitative Mutual Information

The high dimensionality of microarray data sets can lead to low efficiency and overfitting. This paper proposes a multiphase feature selection approach, based on cooperative game theory, for microarray data classification. In the first phase, because microarray data sets are so high-dimensional, the feature set is reduced with one of two filter-based feature selection methods: mutual information or Fisher ratio. In the second phase, the Shapley index is used to evaluate the power of each feature. The main innovation of the approach is to employ Qualitative Mutual Information (QMI) for this purpose. Using QMI makes the selected features more stable, and this stability helps to cope with data imbalance and scarcity. In the third phase, a forward selection scheme is applied that uses a scoring function to weight each feature. The proposed method is compared with other popular feature selection algorithms, such as Fisher ratio and minimum redundancy maximum relevance, and with previous work on cooperative-game-based feature selection. Results on eleven microarray data sets show that the proposed method improves both average classification accuracy and average stability compared with the other approaches.
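The first two phases can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fisher_ratio` and `shapley_values` are hypothetical, the Fisher ratio is written for the two-class case only, and the characteristic function `value` is left as a caller-supplied placeholder because the abstract does not give the QMI-based definition. The Shapley computation enumerates every coalition, so it is only practical once the filter phase has reduced the data to a handful of features.

```python
import itertools
import math

import numpy as np


def fisher_ratio(X, y):
    """Two-class Fisher ratio per feature: (mu0 - mu1)^2 / (var0 + var1).

    X is an (n_samples, n_features) array and y holds labels 0 or 1.
    """
    X0, X1 = X[y == 0], X[y == 1]
    numerator = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    # Small constant avoids division by zero for constant features.
    denominator = X0.var(axis=0) + X1.var(axis=0) + 1e-12
    return numerator / denominator


def shapley_values(n_players, value):
    """Exact Shapley value of each player (feature) in a cooperative game.

    `value` maps a tuple of player indices (a coalition) to a real payoff.
    It is a placeholder here; the paper builds it from QMI, which this
    sketch does not reproduce. Cost grows as 2^n_players.
    """
    phi = np.zeros(n_players)
    n_fact = math.factorial(n_players)
    for i in range(n_players):
        others = [j for j in range(n_players) if j != i]
        for size in range(n_players):
            # Probability weight of a coalition of this size preceding i.
            weight = (math.factorial(size)
                      * math.factorial(n_players - size - 1) / n_fact)
            for coalition in itertools.combinations(others, size):
                marginal = value(coalition + (i,)) - value(coalition)
                phi[i] += weight * marginal
    return phi
```

A typical use would be to keep the top-ranked features by `fisher_ratio` and then pass a characteristic function defined over those survivors to `shapley_values`. For an additive game, where a coalition's payoff is the sum of its members' individual payoffs, each Shapley value equals that member's own payoff, which makes a convenient sanity check.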
