Protein docking refinement by convex underestimation in the low-dimensional subspace of encounter complexes

We propose a novel stochastic global optimization algorithm with applications to the refinement stage of protein docking prediction methods. Our approach can process conformations sampled from multiple clusters, each roughly corresponding to a different binding energy funnel. These clusters are obtained using a density-based clustering method. In each cluster, we identify a smooth “permissive” subspace that avoids high-energy barriers and then underestimate the binding energy function using general convex polynomials in this subspace. We use the underestimator to bias sampling towards its global minimum. Sampling and subspace underestimation are repeated several times, and the conformations sampled in the final iteration form a refined ensemble. We report computational results on a comprehensive benchmark of 224 protein complexes, establishing that the refined ensemble is of significantly higher quality than the original set of conformations given to the algorithm. We also devise a method to enhance the ensemble from which near-native models are selected.
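The subspace underestimation loop described above can be illustrated with a minimal sketch. It is not the method of the paper: it assumes conformations are plain Euclidean parameter vectors, uses principal component analysis as a stand-in for the "permissive" subspace, and replaces general convex polynomials with a separable convex quadratic, so that the fit reduces to a linear program instead of a semidefinite program. All function names, the curvature floor `a_min`, and the sampling `scale` are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def principal_subspace(X, k):
    """Return the sample mean and an orthonormal basis (d x k) of the
    top-k principal directions of the samples X (n x d)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T


def fit_separable_underestimator(Z, f, a_min=1e-3):
    """Fit U(z) = sum_k (a_k z_k^2 + b_k z_k) + c with a_k >= a_min and
    U(z_i) <= f_i at every sample, minimizing sum_i (f_i - U(z_i)).
    Assumption: a separable quadratic, chosen so the fit is an LP."""
    n, k = Z.shape
    # Columns [z^2 | z | 1], so that U(z_i) = Phi[i] @ theta.
    Phi = np.hstack([Z**2, Z, np.ones((n, 1))])
    # Minimizing sum_i (f_i - Phi[i] @ theta) is the same as
    # minimizing -sum_i Phi[i] @ theta, since sum_i f_i is constant.
    cost = -Phi.sum(axis=0)
    bounds = [(a_min, None)] * k + [(None, None)] * (k + 1)
    res = linprog(cost, A_ub=Phi, b_ub=f, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:k], res.x[k:2 * k], res.x[-1]


def underestimator_minimizer(a, b):
    """Global minimizer of the separable quadratic (well defined as a > 0)."""
    return -b / (2.0 * a)


def biased_samples(mean, basis, z_star, n, scale, rng):
    """Draw n new points concentrated around the underestimator's minimum
    and map them from subspace coordinates back to the full space."""
    Z_new = z_star + scale * rng.normal(size=(n, basis.shape[1]))
    return mean + Z_new @ basis.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 6-dimensional "cluster" with most variance in two directions.
    X = rng.normal(size=(200, 6)) * np.array([3.0, 2.0, 0.1, 0.1, 0.1, 0.1])

    def energy(P):
        # Toy funnel plus non-negative roughness; a stand-in for a real
        # binding energy function, which this sketch does not model.
        target = np.array([1.0, -0.5, 0.0, 0.0, 0.0, 0.0])
        return ((P - target) ** 2).sum(axis=1) + rng.uniform(0, 0.5, len(P))

    for it in range(3):
        mean, basis = principal_subspace(X, k=2)
        Z = (X - mean) @ basis
        a, b, c = fit_separable_underestimator(Z, energy(X))
        z_star = underestimator_minimizer(a, b)
        X = biased_samples(mean, basis, z_star, n=200, scale=1.0, rng=rng)
        print(it, mean + basis @ z_star)
```

The constraint `U(z_i) <= f_i` keeps the fitted surface below every sampled energy, while the objective pulls it up as close to the samples as possible, so its minimum tends to track the bottom of the funnel rather than individual rough spots. A faithful implementation would work with rigid-body coordinates and non-separable convex polynomials, which this sketch deliberately leaves out.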
