Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking

Protein complex formation is a central problem in biology: it is involved in most of the cell’s processes and is essential for applications such as drug design and protein engineering. We tackle rigid body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures, assuming that no conformational change occurs within the proteins during binding. We design a novel pairwise-independent SE(3)-equivariant graph matching network that predicts the rotation and translation needed to place one protein at the correct docked position relative to the other. We mathematically guarantee a basic principle: the predicted complex is always identical regardless of the initial locations and orientations of the two structures. Our model, named EQUIDOCK, approximates the binding pockets and predicts the docking poses using keypoint matching and alignment, achieved through optimal transport and a differentiable Kabsch algorithm. Empirically, we achieve significant running time improvements and often outperform existing docking software, despite not relying on heavy candidate sampling, structure refinement, or templates.
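
To make the alignment step concrete, the following is a minimal NumPy sketch of the classical Kabsch solve: given two sets of already-matched 3D keypoints, it recovers the rotation and translation that best superimpose one set on the other in the least-squares sense. It is an illustration under stated assumptions, not the EQUIDOCK implementation: the function name `kabsch`, the toy keypoints, and the use of plain NumPy (rather than a differentiable framework) are all choices made here for clarity, and the keypoint prediction and optimal transport matching are not shown.

```python
import numpy as np


def kabsch(P, Q):
    """Least-squares rigid alignment of matched point sets.

    P, Q: arrays of shape (n, 3); row i of P is assumed to correspond to row i of Q.
    Returns (R, t) such that R @ P[i] + t is as close as possible to Q[i],
    with R a proper rotation (det(R) = +1).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)

    # Center both point sets on their centroids.
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)

    # 3x3 cross-covariance between the centered sets.
    H = (P - p_mean).T @ (Q - q_mean)

    # SVD of the cross-covariance gives the optimal orthogonal matrix.
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation that maps the rotated centroid of P onto the centroid of Q.
    t = q_mean - R @ p_mean
    return R, t


# Toy usage: hypothetical keypoints, moved by a known rigid transform.
rng = np.random.default_rng(0)
keypoints_a = rng.normal(size=(8, 3))
angle = 0.5
R_known = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_known = np.array([3.0, -1.0, 2.0])
keypoints_b = keypoints_a @ R_known.T + t_known

R_est, t_est = kabsch(keypoints_a, keypoints_b)
aligned = keypoints_a @ R_est.T + t_est  # keypoints_a placed onto keypoints_b
```

Because the solve reduces to centroid subtraction, a matrix product, and one SVD, it can in principle be written in any framework with a differentiable SVD, which is what allows an alignment step of this kind to sit inside an end-to-end trained model.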
