An Ab-initio tree-based exploration to enhance sampling of low-energy protein conformations

This paper proposes a robotics-inspired method to enhance sampling of native-like protein conformations using only the amino-acid sequence. Computing such conformations, which is essential for associating structural and functional information with gene sequences, is challenging due to the high dimensionality and the rugged energy surface of the protein conformational space. The contribution of this work is a novel two-layered method to enhance the sampling of geometrically distinct low-energy conformations at a coarse-grained level of detail. The method grows a tree in conformational space while reconciling two goals: (i) guiding the tree towards lower energies and (ii) not oversampling geometrically similar conformations. Discretizations of the energy surface and of a low-dimensional projection space are employed to preferentially select for expansion low-energy conformations in under-explored regions of the conformational space. The tree is expanded with low-energy conformations through a Metropolis Monte Carlo framework that uses a move set of physical fragment configurations. Testing on the sequences of seven small-to-medium, structurally diverse proteins shows that the method samples native-like conformations within a few hours on a single CPU. Analysis shows that the computed conformations are good candidates for further detailed energetic refinement in larger studies in protein engineering and design.
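The two-layered selection and the Metropolis expansion described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the authors' implementation: the energy function `toy_energy`, the two-dimensional projection `project`, the weighting schemes, the randomly generated "fragments", and the constants `FRAG_LEN`, `LEVEL_WIDTH`, and `CELL_WIDTH` are all hypothetical stand-ins, since the abstract does not specify them.

```python
import math
import random
from collections import defaultdict

FRAG_LEN = 3        # assumed fragment length (hypothetical)
LEVEL_WIDTH = 1.0   # assumed width of one energy level (hypothetical)
CELL_WIDTH = 0.5    # assumed width of one projection cell (hypothetical)

def toy_energy(conf):
    """Hypothetical rugged energy over a list of torsion-like angles."""
    return sum(math.cos(3.0 * a) + 0.1 * a * a for a in conf)

def project(conf):
    """Hypothetical low-dimensional projection, discretized into a grid cell."""
    mean = sum(conf) / len(conf)
    spread = max(conf) - min(conf)
    return (math.floor(mean / CELL_WIDTH), math.floor(spread / CELL_WIDTH))

def select(tree, rng):
    """Two-layer selection: energy level first, then projection cell."""
    # Layer 1: bin conformations by energy level and favor lower levels.
    levels = defaultdict(list)
    for conf, energy in tree:
        levels[math.floor(energy / LEVEL_WIDTH)].append(conf)
    ordered = sorted(levels)
    level_weights = [1.0 / (rank + 1) ** 2 for rank in range(len(ordered))]
    level = rng.choices(ordered, weights=level_weights)[0]
    # Layer 2: within that level, favor sparsely populated projection cells.
    cells = defaultdict(list)
    for conf in levels[level]:
        cells[project(conf)].append(conf)
    keys = list(cells)
    cell_weights = [1.0 / len(cells[k]) for k in keys]
    cell = rng.choices(keys, weights=cell_weights)[0]
    return rng.choice(cells[cell])

def metropolis_expand(conf, rng, steps=20, temperature=1.0):
    """Short Metropolis Monte Carlo run using fragment-replacement moves."""
    current = list(conf)
    e_cur = toy_energy(current)
    for _ in range(steps):
        start = rng.randrange(len(current) - FRAG_LEN + 1)
        trial = list(current)
        # A real move set would draw fragments from a library of physical
        # configurations; random angles are used here as a placeholder.
        trial[start:start + FRAG_LEN] = [
            rng.uniform(-math.pi, math.pi) for _ in range(FRAG_LEN)
        ]
        e_trial = toy_energy(trial)
        delta = e_trial - e_cur
        # Metropolis criterion: always accept downhill, sometimes uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, e_cur = trial, e_trial
    return current, e_cur

def grow_tree(n_angles=12, iterations=200, seed=0):
    """Grow a list of (conformation, energy) nodes from an extended root."""
    rng = random.Random(seed)
    root = [0.0] * n_angles
    tree = [(root, toy_energy(root))]
    for _ in range(iterations):
        parent = select(tree, rng)
        tree.append(metropolis_expand(parent, rng))
    return tree
```

Calling `grow_tree()` returns the list of sampled nodes; taking `min(tree, key=lambda node: node[1])` gives the lowest-energy conformation found. The inverse-square level weights and inverse-count cell weights are one simple way to bias towards low energies while penalizing crowded regions; other weighting functions would serve the same purpose.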
