A Novel Multiclass Classification Method with Gene Expression Programming

Classification is one of the fundamental tasks of data mining, and many machine learning algorithms are inherently designed for binary (two-class) decision problems. Gene expression programming (GEP), a hybrid of genetic algorithms (GAs) and genetic programming (GP), is a genotype/phenotype genetic algorithm that evolves computer programs of different sizes and shapes (expression trees) encoded in linear chromosomes of fixed length. In this paper, we propose a novel method for multiclass classification using GEP. Unlike the common approach of formulating a multiclass classification problem as multiple two-class problems, we construct a novel multiclass classifier using the eigenvalue centroid of each class and an eigenvalue-power function. Experimental results on two real data sets demonstrate that the method achieves a preferable solution.
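To make concrete the statement that GEP encodes expression trees of different sizes and shapes in linear chromosomes of fixed length, the following is a minimal sketch of the standard breadth-first decoding of a single gene. It is an illustration under stated assumptions, not the classifier proposed in this paper: the function set `FUNCTIONS`, the terminals `a` and `b`, the example genes, and the helper names `decode`, `evaluate`, and `tree_size` are all hypothetical, and the eigenvalue centroid and eigenvalue-power function are not modeled.

```python
from collections import deque
import operator

# Assumed function set: symbol -> (arity, implementation).
# Any symbol not listed here is treated as a terminal (a variable).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
}

def decode(gene):
    """Build an expression tree from a linear gene, filling it level by level.

    Each node is a list [symbol, children]. Symbols are read left to right;
    reading stops as soon as every function node has its arguments, so the
    trailing symbols of the gene may stay unused.
    """
    symbols = iter(gene)
    root = [next(symbols), []]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [next(symbols), []]
            node[1].append(child)
            queue.append(child)
    return root

def evaluate(node, env):
    """Evaluate a decoded tree given terminal values in env."""
    symbol, children = node
    if symbol in FUNCTIONS:
        args = [evaluate(child, env) for child in children]
        return FUNCTIONS[symbol][1](*args)
    return env[symbol]

def tree_size(node):
    """Count the nodes that the gene actually expresses."""
    return 1 + sum(tree_size(child) for child in node[1])

# Two hypothetical genes of the same fixed length (8 symbols).
gene_1 = "+*aabbab"   # expresses (a * b) + a, using 5 of 8 symbols
gene_2 = "-abababa"   # expresses a - b, using 3 of 8 symbols

env = {"a": 2, "b": 3}
result_1 = evaluate(decode(gene_1), env)
result_2 = evaluate(decode(gene_2), env)
size_1 = tree_size(decode(gene_1))
size_2 = tree_size(decode(gene_2))
```

Both genes have the same length, yet they decode to trees of five and three nodes respectively; the unused tail symbols are what allow genetic operators to change a tree's size and shape without changing the chromosome length.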
