Improving representations of genomic sequence motifs in convolutional networks with exponential activations

Deep convolutional neural networks (CNNs) trained on regulatory genomic sequences tend to build representations in a distributed manner, making it a challenge to extract learned features that are biologically meaningful, such as sequence motifs. Here we perform a comprehensive analysis on synthetic sequences to investigate the role that CNN activation functions play in model interpretability. We show that applying an exponential activation to first-layer filters consistently leads to more interpretable and robust representations of motifs than other commonly used activations. Strikingly, we demonstrate that better test performance does not necessarily imply more interpretable representations with attribution methods. We find that CNNs with exponential activations significantly improve the efficacy of recovering biologically meaningful representations with attribution methods, and we demonstrate that these results generalize to real DNA sequences across several in vivo datasets. Together, this work shows how a small modification to existing CNNs, i.e., using exponential activations in the first layer, can significantly improve the robustness and interpretability of learned representations, both directly in convolutional filters and indirectly through attribution methods.

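For concreteness, the sketch below illustrates the kind of architectural change described above: a first convolutional layer over one-hot-encoded DNA that uses an exponential activation in place of a ReLU. This is a minimal Keras example, not the authors' exact architecture; the sequence length, number of filters, filter width, and downstream layers are illustrative assumptions.

```python
# Minimal sketch (assumed hyperparameters, not the paper's exact model): a CNN
# for one-hot-encoded DNA whose first convolutional layer uses an exponential
# activation. Keras provides "exponential" as a built-in activation string.
from tensorflow import keras
from tensorflow.keras import layers

def build_model(seq_length=200, num_filters=32, filter_width=19, num_classes=1):
    inputs = keras.Input(shape=(seq_length, 4))  # one-hot DNA: A, C, G, T
    # Key modification: exponential activation on the first-layer filters,
    # which encourages each filter to learn a full motif representation.
    x = layers.Conv1D(num_filters, filter_width, padding="same",
                      activation="exponential")(inputs)
    x = layers.MaxPooling1D(pool_size=25)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(num_classes, activation="sigmoid")(x)
    return keras.Model(inputs, outputs)

model = build_model()
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[keras.metrics.AUC()])
```

Note that only the first layer is modified; deeper layers keep standard activations, since the goal is for the first-layer filters themselves to converge to interpretable motif-like representations.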