Informative Gene Selection for Microarray Classification via Adaptive Elastic Net with Conditional Mutual Information

Because it can achieve better performance under weak regularization, the elastic net has attracted wide attention in statistics, machine learning, bioinformatics, and other fields. One variant, the adaptive elastic net (AEN), integrates an adaptive grouping effect. In this paper, we develop a new algorithm, Adaptive Elastic Net with Conditional Mutual Information (AEN-CMI), which further improves AEN by incorporating conditional mutual information into the gene selection process. We apply the algorithm to screen significant genes for two cancers: colon cancer and leukemia. Compared with other algorithms, including the support vector machine, the classic elastic net, and the adaptive elastic net, AEN-CMI obtains the best classification performance while using the fewest genes.
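
For readers unfamiliar with the quantity, conditional mutual information I(X; Y | Z) measures how much two variables share once a third is accounted for. The abstract does not state how AEN-CMI estimates it or how it enters the penalty, so the following is only a minimal sketch of a plug-in estimator for discrete (for example, discretized expression) values. The function name `conditional_mutual_information` and the toy inputs are illustrative assumptions, not the paper's procedure.

```python
import math
from collections import Counter


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in nats for discrete sequences.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(x)
    if not (len(y) == n and len(z) == n):
        raise ValueError("x, y and z must have the same length")

    # Empirical counts for the joint and marginal distributions.
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)

    cmi = 0.0
    for (xi, yi, zi), n_xyz in c_xyz.items():
        # p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) );
        # the 1/n factors cancel inside the logarithm.
        ratio = (n_xyz * c_z[zi]) / (c_xz[(xi, zi)] * c_yz[(yi, zi)])
        cmi += (n_xyz / n) * math.log(ratio)
    return cmi


# Toy data: y copies x, and z is unrelated to both.
dependent = conditional_mutual_information([0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1])
# Toy data: x and y are independent, z is constant.
independent = conditional_mutual_information([0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 0, 0])
```

In this toy case `dependent` equals log 2 (x and y are identical and uniform within each value of z) and `independent` equals 0. With continuous expression values, a discretization step or a different estimator would be needed, and that choice is left open here.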
