Recognition of Colorectal Carcinogenic Tissue with Gene Expression Analysis Using Bayesian Probability

According to WHO research from 2008, colorectal cancer caused approximately 8% of all cancer deaths worldwide. Only a particular set of genes is responsible for its occurrence. Increased or decreased expression levels of these genes prevent the cells in the colorectal region from working properly, i.e. the processes the genes are associated with are disrupted. This research aims to identify those genes and to build a model that determines whether a patient's tissue is carcinogenic. We propose a realistic model of the gene expression probability distribution and use it to calculate the Bayesian posterior probability for classification. We also developed a new methodology for obtaining the best classification results. The gene expression profiling was done using DNA microarray technology. In this research, 24,526 genes were monitored in carcinogenic and healthy tissues equally. We also applied SVMs and binary decision trees, which gave very satisfactory classification accuracy.
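To make the classification step concrete, the following is a minimal sketch of posterior-based classification of a tissue sample. It is not the model proposed here: it assumes, purely for illustration, that each gene's expression is Gaussian within each class and that genes are independent given the class (a naive Bayes simplification). The function names and the two-gene toy values are hypothetical and are not data from this study.

```python
import math

def fit_gaussian(values):
    """Estimate mean and variance of one gene within one class (Gaussian assumption)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, max(var, 1e-6)  # floor the variance to avoid division by zero

def log_gaussian(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def fit_classes(samples_by_class):
    """samples_by_class maps a label to a list of samples; each sample is a
    list of expression values, one per gene."""
    total = sum(len(s) for s in samples_by_class.values())
    model = {}
    for label, samples in samples_by_class.items():
        per_gene = list(zip(*samples))  # transpose: one tuple of values per gene
        model[label] = {
            "prior": len(samples) / total,
            "params": [fit_gaussian(g) for g in per_gene],
        }
    return model

def posterior(model, sample):
    """Bayes' rule: P(class | x) is proportional to P(class) * prod_g p(x_g | class)."""
    log_joint = {}
    for label, m in model.items():
        log_lik = sum(log_gaussian(x, mu, var)
                      for x, (mu, var) in zip(sample, m["params"]))
        log_joint[label] = math.log(m["prior"]) + log_lik
    top = max(log_joint.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp(v - top) for k, v in log_joint.items()}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}

# Hypothetical toy data: two genes, three samples per class.
training = {
    "carcinogenic": [[8.1, 3.0], [7.9, 3.4], [8.4, 2.8]],
    "healthy": [[5.0, 6.1], [5.3, 5.8], [4.8, 6.4]],
}
model = fit_classes(training)
post = posterior(model, [8.0, 3.1])
predicted = max(post, key=post.get)
```

A sample is assigned to the class with the larger posterior. Working in log space keeps the product over many genes from underflowing, which matters when thousands of genes are monitored.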
