VoroCNN: Deep convolutional neural network built on 3D Voronoi tessellation of protein structures

Motivation: Effective use of evolutionary information has recently led to tremendous progress in the computational prediction of three-dimensional (3D) structures of proteins and their complexes. Despite this progress, the accuracy of predicted structures tends to vary considerably from case to case. Since the utility of computational models depends on their accuracy, reliable estimates of the deviation between predicted and native structures are of utmost importance.

Results: We present, for the first time, a deep convolutional neural network (CNN) constructed on a Voronoi tessellation of 3D molecular structures. Despite the irregular data domain, our data representation allows us to introduce both the convolution and the pooling operations of the network efficiently. We trained our model, called VoroCNN, to predict the local quality of 3D protein folds. The prediction results are competitive with the state of the art and superior to those of previous 3D CNN architectures built for the same task. We also discuss practical applications of VoroCNN, for example, in the recognition of protein binding interfaces.

Availability: The model, data, and evaluation tests are available at https://team.inria.fr/nano-d/software/vorocnn/.

Contact: ceslovas.venclovas@bti.vu.lt, sergei.grudinin@inria.fr
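To make the idea of convolution and pooling on an irregular domain concrete, the following is a minimal sketch and not the VoroCNN architecture itself. It assumes that a Voronoi tessellation has already been reduced to an atom-level neighbour graph, given here as a toy adjacency matrix. The function names `graph_conv` and `pool_to_residues`, the mean-over-neighbours aggregation, the ReLU activation, and the identity weights are all illustrative assumptions.

```python
import numpy as np


def graph_conv(features, adjacency, w_self, w_neigh):
    """One illustrative graph-convolution step (hypothetical, not the paper's layer).

    Each node combines its own features with the mean of its neighbours' features.
    """
    degree = adjacency.sum(axis=1, keepdims=True)
    # Mean of neighbour features; isolated nodes (degree 0) get zeros.
    neigh_mean = adjacency @ features / np.maximum(degree, 1.0)
    # ReLU is an assumption made for this sketch.
    return np.maximum(features @ w_self + neigh_mean @ w_neigh, 0.0)


def pool_to_residues(atom_features, atom_to_residue, n_residues):
    """Average-pool atom-level features into residue-level features."""
    pooled = np.zeros((n_residues, atom_features.shape[1]))
    counts = np.zeros((n_residues, 1))
    np.add.at(pooled, atom_to_residue, atom_features)
    np.add.at(counts, atom_to_residue, 1.0)
    return pooled / np.maximum(counts, 1.0)


# Toy data: 4 atoms in a chain 0-1-2-3, standing in for Voronoi neighbours.
adjacency = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
features = np.array([[1, 0], [0, 1], [1, 1], [0, 0]], dtype=float)
atom_to_residue = np.array([0, 0, 1, 1])  # atoms 0,1 -> residue 0; atoms 2,3 -> residue 1

# Identity weights keep the arithmetic easy to follow; a real model would learn them.
hidden = graph_conv(features, adjacency, np.eye(2), np.eye(2))
residue_features = pool_to_residues(hidden, atom_to_residue, n_residues=2)
print(residue_features)
```

Pooling atoms into residues illustrates how a coarser graph level could be obtained on a non-grid domain, which is where a local, per-residue quality score would naturally be read out.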
