GraphProt2: A novel deep learning-based method for predicting binding sites of RNA-binding proteins

CLIP-seq is the state-of-the-art technique for experimentally determining transcriptome-wide binding sites of RNA-binding proteins (RBPs). However, it depends on gene expression, which can vary substantially between conditions, and therefore cannot provide a complete picture of the RBP binding landscape. Computational methods are thus needed to predict the missing binding sites. Here we present GraphProt2, a computational RBP binding site prediction method based on graph convolutional neural networks (GCNs). In contrast to current CNN-based methods, GraphProt2 supports variable-length input and can accurately predict nucleotide-wise binding profiles. We demonstrate its superior performance compared to GraphProt and a CNN-based method on single as well as combined CLIP-seq datasets.
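
To illustrate why a graph-based model can accept variable-length input and still produce per-nucleotide outputs, the following is a minimal sketch and not the GraphProt2 implementation. It assumes a deliberately simple encoding: one node per nucleotide with one-hot features, edges only between backbone neighbours, a single graph convolution with the symmetric normalization of Kipf and Welling, and a mean readout. The helper names and the fixed weight matrix `WEIGHTS` are hypothetical and chosen only for the example.

```python
import math

NUCLEOTIDES = "ACGU"

# Hypothetical fixed 4x2 weight matrix; a trained model would learn this.
WEIGHTS = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.6], [0.2, 0.1]]


def one_hot(seq):
    """One feature row per nucleotide (node)."""
    return [[1.0 if nt == ref else 0.0 for ref in NUCLEOTIDES] for nt in seq]


def backbone_adjacency(n):
    """Adjacency with self-loops and edges between sequence neighbours."""
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0
        if i + 1 < n:
            adj[i][i + 1] = adj[i + 1][i] = 1.0
    return adj


def normalize(adj):
    """Symmetric normalization D^-1/2 A D^-1/2."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj_norm, feats, weights):
    """One graph convolution followed by ReLU."""
    hidden = matmul(matmul(adj_norm, feats), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


def embed(seq, weights=WEIGHTS):
    """Return (per-nucleotide embeddings, fixed-size graph embedding)."""
    adj_norm = normalize(backbone_adjacency(len(seq)))
    nodes = gcn_layer(adj_norm, one_hot(seq), weights)
    graph = [sum(col) / len(nodes) for col in zip(*nodes)]  # mean readout
    return nodes, graph


if __name__ == "__main__":
    for seq in ("ACGU", "GGAUCCA"):
        nodes, graph = embed(seq)
        print(seq, len(nodes), len(graph))
```

Because the weight matrix acts on node features rather than on sequence positions, the same parameters apply to sequences of any length: the node embeddings give one vector per nucleotide, while the readout collapses them to a fixed-size vector for the whole site.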
