An SVM-Wrapped Multiobjective Evolutionary Feature Selection Approach for Identifying Cancer-MicroRNA Markers

MicroRNAs (miRNAs) have been shown to play important roles in gene regulation and various biological processes. Recent studies have revealed that abnormal expression of some specific miRNAs often results in the development of cancer. Microarray datasets containing the expression profiles of several miRNAs are being used to identify miRNAs that are differentially expressed in normal and malignant tissue samples. In this article, a multiobjective feature selection approach is proposed for this purpose. The proposed method uses a genetic algorithm for multiobjective optimization and a support vector machine (SVM) classifier as a wrapper for evaluating the chromosomes that encode feature subsets. The performance has been demonstrated on real-life miRNA datasets, and the identified miRNA markers are reported. Moreover, biological significance tests have been carried out for the obtained markers.
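
To make the wrapper idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a binary chromosome with one bit per miRNA and two objectives to be minimized: the wrapper classifier's error and the number of selected features. The abstract does not state the objectives, so both are assumptions, as are all function names. The `toy_error` function is a hypothetical stand-in for a cross-validated SVM error, so the example runs without external libraries.

```python
from typing import Callable, List, Sequence, Tuple

Chromosome = Tuple[int, ...]       # one bit per miRNA feature (assumed encoding)
Objectives = Tuple[float, float]   # (wrapper error, subset size), both minimized


def selected_features(chrom: Chromosome) -> List[int]:
    """Decode a chromosome into the indices of the selected features."""
    return [i for i, bit in enumerate(chrom) if bit]


def evaluate(chrom: Chromosome,
             wrapper_error: Callable[[List[int]], float]) -> Objectives:
    """Score a chromosome with the wrapper; empty subsets get the worst score."""
    subset = selected_features(chrom)
    if not subset:
        return (float("inf"), float("inf"))
    return (wrapper_error(subset), float(len(subset)))


def dominates(a: Objectives, b: Objectives) -> bool:
    """Pareto dominance: a is no worse everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(population: Sequence[Chromosome],
                 wrapper_error: Callable[[List[int]], float]
                 ) -> List[Tuple[Chromosome, Objectives]]:
    """Return the chromosomes whose objective vectors are non-dominated."""
    scored = [(c, evaluate(c, wrapper_error)) for c in population]
    return [(c, f) for c, f in scored
            if not any(dominates(g, f) for _, g in scored)]


def toy_error(subset: List[int]) -> float:
    """Hypothetical stand-in for SVM error: features 0 and 2 are informative."""
    return (5 - 2 * (0 in subset) - 2 * (2 in subset)) / 10


population = [(1, 0, 1, 0), (1, 1, 1, 1), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 0, 0)]
front = pareto_front(population, toy_error)
```

In practice, `toy_error` would be replaced by a function that trains an SVM on the selected columns of the expression matrix and returns a validation error. A multiobjective genetic algorithm would then apply selection, crossover, and mutation to the population over many generations instead of filtering a fixed population once.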
