Exploring Neural Networks and Related Visualization Techniques in Gene Expression Data

Over the past decade, neural networks have become one of the leading methods in many research fields, excelling particularly at complex classification problems. In this paper, we make two main contributions. First, we conduct a methodological study of neural network modeling for classifying biological traits from structured gene expression data. Second, we propose an approach that applies deep learning visualization techniques to the trained models in order to reveal the specific genes that are important for the correct classification of each trait. Our results suggest that this approach has great potential to become a standard feature importance tool for complex medical research problems, and that it can be generalized to structured data classification problems outside the biological domain.
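As a rough illustration of the second contribution, the sketch below applies one common visualization technique, gradient-based saliency, to a tiny feed-forward classifier over gene expression values. The abstract does not say which technique the paper uses, so the choice of saliency, the network shape, the weights, and the gene names are all assumptions made for illustration only.

```python
import math

def class_score(x, W1, b1, W2, target):
    """Score of one class for a one-hidden-layer tanh network (bias of the output layer omitted)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2[target], hidden))

def saliency(x, W1, b1, W2, target):
    """Absolute gradient of the target class score with respect to each input gene."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
    grad_pre = [W2[target][j] * (1.0 - h * h) for j, h in enumerate(hidden)]
    # d score / d x_i = sum over hidden units j of grad_pre[j] * W1[j][i].
    grads = [sum(grad_pre[j] * W1[j][i] for j in range(len(W1)))
             for i in range(len(x))]
    return [abs(g) for g in grads]

# Hypothetical gene names, expression values, and "trained" weights.
genes = ["gene_a", "gene_b", "gene_c"]
x = [0.4, -1.0, 2.0]
W1 = [[1.5, -0.2, 0.0],
      [0.3, 0.1, 0.0]]   # gene_c is not connected to any hidden unit
b1 = [0.0, 0.1]
W2 = [[1.0, -0.5],
      [-1.0, 0.5]]

scores = saliency(x, W1, b1, W2, target=0)
# Genes ordered from most to least influential for class 0 at this sample.
ranked = [g for _, g in sorted(zip(scores, genes), reverse=True)]
print(ranked)
```

In this toy setup the gene with no connection into the network receives a saliency of zero, and the remaining genes are ordered by how strongly a small change in their expression would move the class score. A real analysis would typically aggregate such scores over many samples and rely on a framework's automatic differentiation rather than a hand-written gradient.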
