Gene expression profiles based Human cancer diseases classification

Cancers are a large family of diseases that involve abnormal cell growth with the potential to spread to other parts of the body. Cancer in all of its forms is a major cause of death worldwide. In cancer diagnosis, classifying the different tumor types is of the greatest significance: accurate prediction of the tumor type allows better-targeted treatment and minimizes toxicity to patients. Accordingly, methodologies that can effectively differentiate between cancer subtypes are essential. This paper presents a new methodology for classifying human cancer diseases based on gene expression profiles. The proposed methodology combines Information Gain (IG) and a Deep Genetic Algorithm (DGA). It first uses IG for feature selection, then uses a Genetic Algorithm (GA) for feature reduction, and finally uses Genetic Programming (GP) to classify cancer types. The proposed system is evaluated by classifying cancer diseases in seven cancer datasets, and the results are compared with those of the most recent approaches.
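The IG-based feature selection step can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how expression values are discretized, so splitting each gene at its mean expression is an assumption, and the names `entropy`, `information_gain`, and `rank_genes` are hypothetical. The GA and GP stages are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Reduction in label entropy from splitting samples at `threshold`."""
    high = [y for x, y in zip(values, labels) if x > threshold]
    low = [y for x, y in zip(values, labels) if x <= threshold]
    # Weighted entropy of the two partitions (empty partitions are skipped).
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (high, low) if part)
    return entropy(labels) - remainder


def rank_genes(expression, labels, top_k):
    """Return the `top_k` gene names with the highest information gain.

    `expression` maps a gene name to its expression values, one per sample.
    Assumption: each gene is binarized at its mean expression level.
    """
    scores = {}
    for gene, values in expression.items():
        threshold = sum(values) / len(values)
        scores[gene] = information_gain(values, labels, threshold)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

In this sketch, a gene whose high and low expression levels separate the classes perfectly scores the full label entropy, while a gene whose split leaves the class mix unchanged scores zero. Only the top-ranked genes would then be passed on to the GA stage for further reduction.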
