Microarray cancer feature selection: Review, challenges and research directions

Abstract Microarray technology has become an emerging trend in genetic research, where many researchers employ it to study and investigate gene expression levels in a given organism. Microarray experiments have many applications in the health sector, such as disease prediction and diagnosis, cancer studies, and so on. The enormous quantity of raw gene expression data usually creates analytical and computational complexities, including feature selection and classification of the datasets into the correct class or group. Achieving satisfactory cancer classification accuracy with the complete set of genes remains a great challenge because of the high dimensionality, small sample size, and noise in gene expression data. Feature reduction is a critical and sensitive step in the classification task. This paper therefore presents a comprehensive survey of studies on microarray cancer classification, with a focus on feature selection methods. It discusses in detail a taxonomy of the feature selection methods used for microarray cancer classification, along with open research issues.
