Occam's razor in dimension reduction: Using reduced row echelon form to find linearly independent features in high-dimensional microarray datasets

Microarray datasets suffer from small sample sizes and extremely large numbers of features, so feature selection plays a crucial role in the performance of models trained on them. A typical feature selection method consists of two main parts: an evaluation criterion and a search strategy. Ordinary datasets do not have a huge number of features relative to their number of samples, so the search strategy of a feature selection method can explore their search space. In contrast, microarray high-dimensional datasets have an enormous number of features; their search space is therefore very large, and searching it is prohibitively expensive. In this paper, we bring the philosophy of Occam's razor to feature subset selection in order to free high-dimensional datasets from computational search methods. The proposed method selects features in two stages. In the first stage, the features are rearranged according to their importance in the dataset; in the second stage, the fundamental concept of the reduced row echelon form is applied to the dataset to find linearly independent features. To assess the effectiveness of the proposed method, experiments are carried out on nine binary microarray high-dimensional datasets. The results are compared with eleven state-of-the-art feature selection algorithms, including Correlation-based Feature Selection (CFS), Fast Correlation-Based Filter (FCBF), Interact (INT) and Maximum Relevancy Minimum Redundancy (MRMR). The averaged results are analyzed with a non-parametric statistical test, which reveals that the proposed method significantly outperforms the others in terms of accuracy, sensitivity, specificity, G-mean, number of selected features and computational complexity.
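To make the two-stage idea concrete, the following minimal Python sketch ranks the columns of a samples-by-features matrix X by a relevance score and then reads the pivot columns of the reduced row echelon form as the linearly independent features. This is an illustration under stated assumptions, not the authors' exact formulation: the ranking criterion (absolute Pearson correlation with the class label), the function name rref_feature_selection, and the toy data are all assumptions introduced here.

    import numpy as np
    from sympy import Matrix

    def rref_feature_selection(X, y):
        # Stage 1: rank features by importance. The score used here, the
        # absolute Pearson correlation of each feature with the class label,
        # is an assumed stand-in for whichever ranking the paper uses.
        scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                           for j in range(X.shape[1])])
        order = np.argsort(-scores)        # most relevant features first
        X_ranked = X[:, order]

        # Stage 2: reduce the rearranged data matrix to reduced row echelon
        # form; the pivot columns index a maximal linearly independent set of
        # features. Integer entries keep sympy's elimination exact; real
        # expression data would need a numerical tolerance instead (e.g. a
        # custom iszerofunc for rref, or a pivoted QR factorization).
        _, pivots = Matrix(X_ranked.tolist()).rref()
        return order[list(pivots)]         # map pivots back to original indices

    # Toy usage: column 2 is a linear combination of columns 0 and 1, so it
    # can never be selected together with both of them.
    rng = np.random.default_rng(0)
    X = rng.integers(-5, 6, size=(10, 4))
    X[:, 2] = X[:, 0] + 2 * X[:, 1]
    y = (X[:, 0] > 0).astype(float)
    print(rref_feature_selection(X, y))

Because Gaussian elimination makes a pivot of the leftmost column that is independent of the pivots already chosen, performing the importance ranking before the elimination biases the selected independent set toward the most relevant features, which is why the ordering stage matters.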

[1] Mengjie Zhang et al., Particle Swarm Optimization for Feature Selection in Classification: A Multi-Objective Approach, 2013, IEEE Transactions on Cybernetics.

[2] Verónica Bolón-Canedo et al., Data classification using an ensemble of filters, 2014, Neurocomputing.

[3] Mohammad Kazem Ebrahimpour et al., Proposing a novel feature selection algorithm based on Hesitant Fuzzy Sets and correlation concepts, 2015, International Symposium on Artificial Intelligence and Signal Processing (AISP).

[4] Ramazan Coban et al., Identification of linear dynamic systems using the artificial bee colony algorithm, 2012.

[5] Mohammad Kazem Ebrahimpour et al., Feature subset selection using Information Energy and correlation coefficients of hesitant fuzzy sets, 2015, 7th Conference on Information and Knowledge Technology (IKT).

[6] Yunming Ye et al., Stratified sampling for feature subspace selection in random forests for high dimensional data, 2013, Pattern Recognit.

[7] Huan Liu et al., Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution, 2003, ICML.

[8] Hui-Huang Hsu et al., Hybrid feature selection by combining filters and wrappers, 2011, Expert Syst. Appl.

[9] Salwani Abdullah et al., Hybridizing ReliefF, mRMR filters and GA wrapper approaches for gene selection, 2012.

[10] Ferat Sahin et al., A survey on feature selection methods, 2014, Comput. Electr. Eng.

[11] Verónica Bolón-Canedo et al., A review of feature selection methods on synthetic data, 2013, Knowledge and Information Systems.

[12] F. Azuaje et al., Multiple SVM-RFE for gene selection in cancer classification with expression data, 2005, IEEE Transactions on NanoBioscience.

[13] Arunkumar Chinnaswamy et al., Hybrid Feature Selection Using Correlation Coefficient and Particle Swarm Optimization on Microarray Gene Expression Data, 2015, IBICA.

[14] Verónica Bolón-Canedo et al., Feature selection and classification in multiple class datasets: An application to KDD Cup 99 dataset, 2011, Expert Syst. Appl.

[15] Francisco Herrera et al., Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power, 2010, Inf. Sci.

[16] J. C. Rajapakse et al., SVM-RFE With MRMR Filter for Gene Selection, 2010, IEEE Transactions on NanoBioscience.

[17] Le Song et al., Feature Selection via Dependence Maximization, 2012, J. Mach. Learn. Res.

[18] Shailendra Singh et al., Review on Feature Selection Approaches Using Gene Expression Data, 2016.

[19] Francisco Herrera et al., Study on the Impact of Partition-Induced Dataset Shift on k-Fold Cross-Validation, 2012, IEEE Transactions on Neural Networks and Learning Systems.

[20] Qinbao Song et al., A Fast Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data, 2013, IEEE Transactions on Knowledge and Data Engineering.

[21] Varsha S. Sonwane, A Fast Clustering-Based Feature Subset Selection Algorithm, 2017.

[22] Hiroshi Motoda et al., Feature Selection for Knowledge Discovery and Data Mining, 1998, The Springer International Series in Engineering and Computer Science.

[23] Yukyee Leung et al., A Multiple-Filter-Multiple-Wrapper Approach to Gene Selection and Microarray Data Classification, 2010, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[24] Giovanni Iacca et al., Ockham's Razor in memetic computing: Three stage optimal memetic exploration, 2012, Inf. Sci.

[25] Enrique Alba et al., Two hybrid wrapper-filter feature selection algorithms applied to high-dimensional microarray experiments, 2016, Appl. Soft Comput.

[26] Huan Liu et al., Searching for Interacting Features, 2007, IJCAI.

[27] Jing Wang et al., Unsupervised feature selection through Gram-Schmidt orthogonalization - A word co-occurrence perspective, 2016, Neurocomputing.

[28] Mohammad Kazem Ebrahimpour et al., Ensemble of feature selection methods: A hesitant fuzzy sets approach, 2017, Appl. Soft Comput.

[29] Jesús Alcalá-Fdez et al., KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework, 2011, J. Multiple Valued Log. Soft Comput.

[30] Antônio de Pádua Braga et al., GA-KDE-Bayes: an evolutionary wrapper method based on non-parametric density estimation applied to bioinformatics problems, 2013, ESANN.

[31] Verónica Bolón-Canedo et al., A review of microarray datasets and applied feature selection methods, 2014, Inf. Sci.

[32] Marko Robnik-Sikonja et al., Theoretical and Empirical Analysis of ReliefF and RReliefF, 2003, Machine Learning.

[33] C. B. Millham et al., Introduction to Linear Algebra, 1972.

[34] Jugal K. Kalita et al., MIFS-ND: A mutual information-based feature selection method, 2014, Expert Syst. Appl.

[35] E. Theodorsson-Norheim, Friedman and Quade tests: BASIC computer program to perform nonparametric two-way analysis of variance and multiple comparisons on ranks of several related samples, 1987, Computers in Biology and Medicine.

[36] Francisco Herrera et al., A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms, 2011, Swarm Evol. Comput.

[37] Beatriz A. Garro et al., Classification of DNA microarrays using artificial neural networks and ABC algorithm, 2016, Appl. Soft Comput.

[38] Pradipta Maji et al., Rough set based maximum relevance-maximum significance criterion and gene selection from microarray data, 2011, Int. J. Approx. Reason.