Microarray Classification and Gene Selection with FS-NEAT

The analysis of microarrays has the potential to identify and predict predisposition to diseases such as cancer, opening a path to better diagnosis and improved treatments. Microarrays can also help to find genetic biomarkers, that is, genes whose expression is related to a specific disease stage or condition. However, because microarray experiments measure a huge number of genes on a small number of available samples, computational methods for such data must overcome difficulties in both classification and feature selection. This paper presents adaptations of FS-NEAT, an evolutionary algorithm that uses a genetic algorithm to create and optimize neural networks, as a tool that can satisfactorily perform both tasks simultaneously and automatically. The method is tested on a leukemia dataset containing six imbalanced classes and compared with other classifiers, and the selected genes are biologically validated.
