GraphQA: protein model quality assessment using graph convolutional networks

MOTIVATION: Proteins are ubiquitous molecules whose function in biological processes is determined by their 3D structure. Experimental identification of a protein's structure can be time-consuming and prohibitively expensive, and is not always possible. Alternatively, protein folding can be modeled using computational methods, which, however, are not guaranteed to produce optimal results. GraphQA is a graph-based method for estimating the quality of protein models that offers favorable properties such as representation learning, explicit modeling of both sequential and 3D structure, geometric invariance, and computational efficiency.

RESULTS: GraphQA performs similarly to state-of-the-art methods despite using a relatively small number of input features. In addition, the graph network structure provides an improvement over the architecture used in ProQ4 when operating on the same input features. Finally, the individual contributions of GraphQA components are carefully evaluated.

AVAILABILITY AND IMPLEMENTATION: A PyTorch implementation, datasets, experiments, and a link to an evaluation server are available through this GitHub repository: github.com/baldassarreFe/graphqa.

SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.
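
As an illustration of what "explicit modeling of both sequential and 3D structure" and "geometric invariance" can mean for a graph-based method, the following is a minimal sketch and not the GraphQA implementation. It assumes one 3D point per residue, a hypothetical 8 Å contact cutoff, and edge features limited to pairwise distance and sequence separation; the function names and toy coordinates are invented for the example. Because these edge features depend only on relative positions, rotating or translating the whole model leaves the graph unchanged.

```python
import math
from itertools import combinations


def build_residue_graph(coords, cutoff=8.0):
    """Build an undirected residue graph from one 3D point per residue.

    An edge (i, j) is created when residues are adjacent in the sequence
    or closer than `cutoff` (an assumed value, in angstroms).
    Each edge stores (distance, sequence_separation).
    """
    edges = {}
    for i, j in combinations(range(len(coords)), 2):
        dist = math.dist(coords[i], coords[j])
        separation = j - i
        if separation == 1 or dist < cutoff:
            edges[(i, j)] = (dist, separation)
    return edges


def rotate_z(coords, angle):
    """Rotate all points around the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


def translate(coords, shift):
    """Shift all points by the vector `shift`."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in coords]


# Toy hairpin-like chain: consecutive residues are 3.8 angstroms apart.
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0),
    (11.4, 3.8, 0.0), (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]

graph = build_residue_graph(coords)
moved = translate(rotate_z(coords, 1.0), (5.0, -2.0, 9.0))
graph_moved = build_residue_graph(moved)

# Same edges and (up to floating-point error) same distances after the rigid motion.
same_edges = graph.keys() == graph_moved.keys()
same_distances = all(
    math.isclose(graph[k][0], graph_moved[k][0], abs_tol=1e-9) for k in graph
)
print(same_edges, same_distances)
```

In this toy chain the first and last residues are far apart in sequence but close in space, so they are connected by a spatial edge, which is the kind of contact a purely sequential model would not represent directly.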
