Molecular classification of cancer types from microarray data using the combination of genetic algorithms and support vector machines

Simultaneous multiclass classification of tumor types is essential for future clinical implementations of microarray‐based cancer diagnosis. In this study, we combined genetic algorithms (GAs) with all‐paired support vector machines (SVMs) for multiclass cancer identification. Predictive features were selected through iterative SVM/GA runs followed by recursive feature elimination as a post‐processing step, yielding a very compact, cancer‐related predictive gene set. Leave‐one‐out cross‐validation yielded accuracies of 87.93% for the eight‐class and 85.19% for the fourteen‐class cancer classification, outperforming results from previously published methods.
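
The overall shape of such a pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a GA over gene-subset bitmasks whose fitness is the leave-one-out accuracy of an all-pairs voting classifier, and it swaps in a pairwise nearest-centroid rule where the study uses SVMs, so that the example runs on the standard library alone. The recursive feature elimination step is omitted. All function names, the toy data, and the parameters (`penalty`, `pop_size`, `generations`, `p_mut`) are hypothetical.

```python
import random
from itertools import combinations

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_all_pairs(train_X, train_y, x):
    """One binary decision per class pair, then majority vote.
    Assumption: nearest-centroid stands in for each pairwise SVM."""
    classes = sorted(set(train_y))
    cents = {c: centroid([r for r, t in zip(train_X, train_y) if t == c])
             for c in classes}
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        winner = a if sq_dist(x, cents[a]) <= sq_dist(x, cents[b]) else b
        votes[winner] += 1
    return max(classes, key=lambda c: votes[c])

def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy using only the genes switched on in mask."""
    genes = [i for i, keep in enumerate(mask) if keep]
    if not genes:
        return 0.0
    Xs = [[row[i] for i in genes] for row in X]
    hits = 0
    for k in range(len(Xs)):
        tr_X, tr_y = Xs[:k] + Xs[k + 1:], y[:k] + y[k + 1:]
        hits += predict_all_pairs(tr_X, tr_y, Xs[k]) == y[k]
    return hits / len(Xs)

def fitness(X, y, mask, penalty=0.01):
    # Hypothetical fitness: reward accuracy, lightly penalize large gene sets.
    return loo_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Elitist GA: keep the best half, refill by one-point crossover and mutation."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            children.append([1 - g if rng.random() < p_mut else g for g in child])
        pop = parents + children
    return max(pop, key=lambda m: fitness(X, y, m))

def make_toy_data(seed=1):
    """Synthetic data: 3 classes x 6 samples, genes 0 and 1 informative, 6 noise genes."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (m0, m1) in enumerate([(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]):
        for _ in range(6):
            noise = [rng.gauss(0, 1) for _ in range(6)]
            X.append([m0 + rng.gauss(0, 0.3), m1 + rng.gauss(0, 0.3)] + noise)
            y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    best = ga_select(X, y)
    print("selected genes:", [i for i, g in enumerate(best) if g])
    print("LOO accuracy:", loo_accuracy(X, y, best))
```

In a setting closer to the study, the nearest-centroid rule would be replaced by a trained SVM for each class pair and the toy matrix by real expression profiles. Note also that scoring the GA with leave-one-out accuracy on the same samples later used for evaluation can bias the reported accuracy upward.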
