Colon cancer prediction with genetic profiles using evolutionary techniques

Microarray data provide information on the expression levels of thousands of genes in a cell in a single experiment, which makes the DNA microarray a powerful tool for cancer diagnosis. Numerous efforts have been made to use gene expression profiles to improve the precision of tumor classification. This study compares the class prediction accuracy of two classifiers, Genetic Programming and Genetically Evolved Decision Trees, using the best 10 and best 20 genes ranked by the t-statistic and by mutual information. Genetic Programming proved to be the better classifier for this dataset, based on area under the receiver operating characteristic curve (AUC) and total accuracy, when mutual information based feature selection was used. We conclude that Genetic Programming together with mutual information based feature selection is the most efficient alternative to existing colon cancer prediction techniques.
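The gene ranking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Welch form of the t-statistic, the tercile binning used for the mutual information estimate, the function names (`t_statistic`, `mutual_information`, `top_k`), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def t_statistic(X, y):
    """Absolute two-sample (Welch) t-statistic for each gene.

    X: (n_samples, n_genes) expression matrix; y: integer labels in {0, 1}.
    """
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se

def mutual_information(X, y, bins=3):
    """Plug-in mutual information (in nats) between each binned gene and the label.

    Each gene is discretized into `bins` quantile bins (an assumed choice).
    """
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        # Interior quantile edges give bin indices 0 .. bins-1.
        edges = np.quantile(X[:, j], np.linspace(0, 1, bins + 1)[1:-1])
        xb = np.digitize(X[:, j], edges)
        joint = np.zeros((bins, 2))
        np.add.at(joint, (xb, y), 1)          # contingency counts
        joint /= joint.sum()                  # joint probabilities
        px = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        nz = joint > 0
        scores[j] = np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz]))
    return scores

def top_k(scores, k):
    """Indices of the k highest-scoring genes, best first."""
    return np.argsort(scores)[::-1][:k]

# Synthetic stand-in for a microarray dataset (not the data used in the study).
rng = np.random.default_rng(0)
y = np.array([0] * 22 + [1] * 40)
X = rng.normal(size=(y.size, 200))
X[y == 1, :5] += 3.0                          # genes 0-4 are made informative

best10_t = top_k(t_statistic(X, y), 10)
best20_mi = top_k(mutual_information(X, y), 20)
print("top 10 by t-statistic:", best10_t)
print("top 20 by mutual information:", best20_mi)
```

The selected column indices would then be used to subset the expression matrix before training either classifier; the classifiers themselves are outside the scope of this sketch.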
