Investigating the Formation of Structural Elements in Proteins Using Local Sequence-Dependent Information and a Heuristic Search Algorithm

Structural elements inserted in proteins are essential for defining the folding/unfolding mechanisms and partner recognition events that govern signaling processes in living organisms. Here, we present an original approach to model the folding mechanism of these structural elements. Our approach exploits local, sequence-dependent structural information encoded in a database of three-residue fragments extracted from a large set of high-resolution, experimentally determined protein structures. The computation of conformational transitions leading to the formation of the structural elements is formulated as a discrete path search problem over this database. To solve this problem, we propose a heuristically guided depth-first search algorithm. The domain-dependent heuristic function aims to minimize the length of the path in terms of angular distances while maximizing the local density of the intermediate states, which is related to their probability of existence. We applied the strategy to two small synthetic polypeptides mimicking two common structural motifs in proteins. The extracted folding mechanisms are very similar to those obtained with traditional, computationally expensive approaches. These results show that the proposed approach, thanks to its simplicity and computational efficiency, is a promising research direction.
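The search formulation summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the states are toy (phi, psi) pairs rather than entries from the three-residue fragment database, and the names `guided_dfs`, `step`, `radius`, and `weight`, as well as the scoring formula (angular distance to the goal minus a weighted neighbor count), are assumptions chosen only to show how a depth-first search can prefer short angular moves through densely populated states.

```python
import math

def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def angular_distance(s, t):
    """Euclidean-style distance between two tuples of dihedral angles."""
    return math.sqrt(sum(angular_diff(a, b) ** 2 for a, b in zip(s, t)))

def local_density(state, states, radius):
    """Number of states within `radius` of `state` (the state itself included)."""
    return sum(1 for other in states if angular_distance(state, other) <= radius)

def guided_dfs(states, start, goal, step, radius, weight=1.0):
    """Depth-first search that expands the best-scoring neighbor first.

    Hypothetical score (an assumption, not taken from the paper):
    distance to the goal minus `weight` times the local density.
    Lower scores are explored first. Returns a path or None.
    """
    density = {s: local_density(s, states, radius) for s in states}

    def score(s):
        return angular_distance(s, goal) - weight * density[s]

    visited = {start}
    path = [start]

    def visit(current):
        if current == goal:
            return True
        candidates = [s for s in states
                      if s not in visited and angular_distance(current, s) <= step]
        for nxt in sorted(candidates, key=score):
            if nxt in visited:  # may have been reached through an earlier sibling
                continue
            visited.add(nxt)
            path.append(nxt)
            if visit(nxt):
                return True
            path.pop()  # dead end: backtrack
        return False

    return path if visit(start) else None

# Toy state set (illustrative values only, not real fragment data).
states = [
    (-150.0, 150.0), (-170.0, 170.0), (-120.0, 120.0), (-90.0, 90.0),
    (-90.0, 40.0), (-70.0, 0.0), (-60.0, -40.0),
]
start, goal = states[0], states[-1]
path = guided_dfs(states, start, goal, step=60.0, radius=45.0)
print(path)
```

In this sketch the density term is a plain neighbor count, so `weight` trades degrees against counts; a real application would need a properly normalized density estimate and a transition rule derived from the fragment database.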
