Interpreting Deep Neural Networks Beyond Attribution Methods: Quantifying Global Importance of Genomic Features

Although deep neural networks (DNNs) have been highly successful at improving performance on various prediction tasks in computational genomics, it remains difficult to understand why they make any given prediction. In genomics, the main approaches to interpreting a high-performing DNN are to visualize learned representations through weight visualizations and to apply attribution methods. While these methods can be informative, each has strong limitations. For instance, attribution methods only uncover the independent contribution of single nucleotide variants in a given sequence. Here we discuss and argue for global importance analysis, which can quantify the population-level importance of putative features and their interactions learned by a DNN. We highlight recent work that has benefited from this interpretability approach and then discuss connections between global importance analysis and causality.
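
To make the idea of a population-level importance score concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that global importance is estimated as the average change in a model's prediction when a putative pattern is embedded at a fixed position in a population of background sequences. The function names, the uniform background sampler, and `toy_model` (a motif-match scorer standing in for a trained DNN) are all hypothetical choices made for illustration.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as an (L, 4) one-hot array."""
    idx = np.array([ALPHABET.index(c) for c in seq])
    return np.eye(4)[idx]

def sample_background(n, length, rng):
    """Assumed background: n random sequences with uniform base frequencies."""
    idx = rng.integers(0, 4, size=(n, length))
    return np.eye(4)[idx]  # shape (n, length, 4)

def embed(seqs, pattern, position):
    """Return a copy of seqs with the one-hot pattern written at `position`."""
    out = seqs.copy()
    out[:, position:position + pattern.shape[0], :] = pattern
    return out

def global_importance(model, seqs, pattern, position):
    """Mean prediction difference between sequences with and without the pattern."""
    with_pattern = model(embed(seqs, pattern, position))
    without_pattern = model(seqs)
    return float(np.mean(with_pattern - without_pattern))

def toy_model(seqs, motif=one_hot("TGACTCA")):
    """Hypothetical stand-in for a trained DNN: best motif match per sequence."""
    k = motif.shape[0]
    # windows has shape (n, L - k + 1, 4, k): every length-k window of each sequence
    windows = np.lib.stride_tricks.sliding_window_view(seqs, k, axis=1)
    scores = np.einsum("nwak,ka->nw", windows, motif)  # matches per window
    return scores.max(axis=1)

rng = np.random.default_rng(0)
background = sample_background(200, 50, rng)
effect = global_importance(toy_model, background, one_hot("TGACTCA"), position=20)
print(f"estimated global importance of the motif: {effect:.2f}")
```

Because the same background sequences are scored with and without the embedded pattern, the average difference reflects the pattern itself rather than whatever else happens to be in any single sequence. Embedding two patterns at once and comparing against the single-pattern effects would extend the same sketch to interactions.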
