Simulated annealing for supervised gene selection

Genomic data, and biomedical data more generally, are often characterized by high dimensionality. An input selection procedure can serve two goals: highlighting the relevant variables (genes) and potentially improving classification performance. In this paper, we propose a wrapper approach to gene selection for the classification of gene expression data, combining simulated annealing with a supervised classifier. The proposed approach performs a global combinatorial search through the space of all possible input subsets, handles numerical, categorical, or mixed inputs, and finds (sub-)optimal input subsets that yield low classification error. The method has been tested on publicly available bioinformatics data sets using support vector machines, and on a mixed-type data set using classification trees. We also propose heuristics that speed up convergence. The experimental results highlight the ability of the method to select minimal sets of relevant features.
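As a rough illustration of the wrapper idea described above (a sketch, not the authors' exact implementation), the code below pairs a Metropolis-style simulated annealing loop with the cross-validated error of a linear SVM as the energy function. It assumes scikit-learn's `SVC` and `cross_val_score`; the function name `anneal_feature_selection`, the cooling rate `alpha`, and the iteration budget are illustrative choices, not taken from the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def anneal_feature_selection(X, y, n_iter=500, t0=1.0, alpha=0.99, rng=None):
    """Wrapper gene selection by simulated annealing around an SVM.

    A state is a binary mask over features; its energy is the
    cross-validated classification error of an SVM restricted to the
    selected features.
    """
    rng = np.random.default_rng(rng)
    n_features = X.shape[1]

    def energy(mask):
        if not mask.any():            # forbid the empty subset
            return 1.0
        scores = cross_val_score(SVC(kernel="linear"), X[:, mask], y, cv=5)
        return 1.0 - scores.mean()    # mean cross-validated error

    mask = rng.random(n_features) < 0.5          # random initial subset
    e = energy(mask)
    best_mask, best_e = mask.copy(), e
    t = t0
    for _ in range(n_iter):
        candidate = mask.copy()
        j = rng.integers(n_features)             # flip one gene in or out
        candidate[j] = ~candidate[j]
        e_new = energy(candidate)
        # Metropolis rule: always accept improvements, accept worse
        # subsets with probability exp(-delta / T)
        if e_new <= e or rng.random() < np.exp(-(e_new - e) / t):
            mask, e = candidate, e_new
            if e < best_e:
                best_mask, best_e = mask.copy(), e
        t *= alpha                               # geometric cooling
    return best_mask, best_e
```

Flipping a single gene per step keeps moves local, while the geometric cooling schedule and the tracking of the best subset seen so far are common heuristics in the spirit of the convergence speed-ups mentioned above.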
