Adaptive local learning in sampling based motion planning for protein folding

Background: Simulating protein folding motions is an important problem in computational biology. Motion planning algorithms, such as Probabilistic Roadmap Methods (PRMs), have been successful in modeling the folding landscape. PRMs and their variants proceed in several phases: sampling, connection, and path extraction. Most of the running time is spent in the connection phase, and selecting which variant to employ is a difficult task. Global machine learning has been applied to the connection phase but is inefficient in situations with varying topology, such as those typical of folding landscapes.

Results: We develop a local learning algorithm that uses the past performance of methods within the neighborhood of the current connection attempt as its basis for learning. It is sensitive not only to different types of landscapes but also to differing regions within a single landscape, which removes the need to partition the landscape explicitly. We perform experiments on 23 proteins of varying secondary structure makeup with 52–114 residues, and compare the success rates obtained with our methods and with other methods. We demonstrate a clear need for learning (only the learning methods were able to validate against all available experimental data) and show that local learning is superior to global learning, producing, in many cases, significantly higher quality results than the other methods.

Conclusions: We present an algorithm that uses local learning to select appropriate connection methods during roadmap construction for protein folding. Our method removes the burden of deciding which connection method to use, leverages the strengths of the individual input methods, and is extensible to future connection methods.
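The local learning idea summarized in the abstract above (choosing a connection method from how methods performed in the neighborhood of the current attempt) can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `choose_connection_method`, the history format, the k-nearest neighborhood, the smoothed success rate, and the `explore` probability are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def choose_connection_method(query, history, methods, k=5, explore=0.1, rng=random):
    """Pick a connection method for `query` from nearby past attempts.

    Hypothetical interface, for illustration only:
      query   -- coordinates of the node being connected (sequence of floats)
      history -- list of (point, method_name, succeeded) records
      methods -- list of candidate method names
    """
    # Assumption: occasionally pick at random so every method keeps being tried.
    if not history or rng.random() < explore:
        return rng.choice(methods)

    # The "neighborhood" here is simply the k past attempts closest to the query.
    nearest = sorted(history, key=lambda rec: math.dist(query, rec[0]))[:k]

    wins = defaultdict(int)
    tries = defaultdict(int)
    for _, method, succeeded in nearest:
        tries[method] += 1
        wins[method] += int(succeeded)

    # Smoothed local success rate; a method with no local record scores 0.5.
    def score(method):
        return (wins[method] + 1) / (tries[method] + 2)

    return max(methods, key=score)


# Toy history: "m1" works near the origin, "m2" works near (10, 10).
history = [
    ((0.0, 0.0), "m1", True),
    ((0.0, 1.0), "m2", False),
    ((1.0, 0.0), "m1", True),
    ((10.0, 10.0), "m2", True),
    ((10.0, 11.0), "m1", False),
    ((11.0, 10.0), "m2", True),
]
near_origin = choose_connection_method((0.1, 0.1), history, ["m1", "m2"], k=3, explore=0.0)
near_far = choose_connection_method((10.0, 10.1), history, ["m1", "m2"], k=3, explore=0.0)
```

Because the score is computed only from nearby records, the same history yields different choices in different regions, which is the property the abstract attributes to local learning. A global learner in this sketch would instead pool all six records and return one method everywhere.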
