Statistics of Knots, Geometry of Conformations, and Evolution of Proteins

Like shoelaces, the backbones of proteins may become entangled and form knots. However, only a few knots in native proteins have been identified so far. To assess the rarity of knots in proteins more quantitatively, we make an explicit comparison between the knotting probabilities in native proteins and in random compact loops. We identify knots in proteins statistically: we apply the mathematics of knot invariants to the loops obtained by complementing the protein backbone with an ensemble of random closures, and we assign a knot type to a given protein if and only if that knot dominates the closure statistics (which indicates that the knot is determined by the protein and not by a particular method of closure). We also examine the local fractal or geometrical properties of proteins via computational measurements of the end-to-end distance and the degree of interpenetration of their subchains. Although we did identify some rather complex knots, we show that native conformations of proteins have statistically fewer knots than random compact loops, and that the local geometrical properties, such as the crumpled character of the conformations at a certain range of scales, are consistent with the rarity of knots. From these results, we may conclude that the known “protein universe” (set of native conformations) avoids knots. However, the precise reason for this is unknown; for instance, it is unclear whether knots were removed by evolution because of their unfavorable effect on protein folding or function, or because of some other unidentified property of protein evolution.
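
The statistical assignment described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: each closure joins both chain termini to a single random point on a sphere around the chain (one common closure scheme), "dominates" is read as a frequency above a `threshold` of 0.5, and the knot classifier `classify` is a hypothetical user-supplied function (for example, one built on a knot invariant) that maps a closed polygon to a knot label.

```python
import math
import random
from collections import Counter


def random_point_on_sphere(center, radius, rng):
    """Return a point drawn uniformly from the sphere surface."""
    while True:
        # A normalized Gaussian vector gives a uniformly random direction.
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            break
    return tuple(center[k] + radius * v[k] / norm for k in range(3))


def random_closure(backbone, rng, scale=2.0):
    """Close an open chain through one random point on an enclosing sphere.

    ASSUMPTION: single-point closure and the `scale` factor are
    illustrative choices, not taken from the paper.
    The returned vertex list is a closed polygon: the last vertex is
    understood to connect back to the first.
    """
    n = len(backbone)
    center = tuple(sum(p[k] for p in backbone) / n for k in range(3))
    r_max = max(math.dist(p, center) for p in backbone)
    far_point = random_point_on_sphere(center, scale * r_max, rng)
    return list(backbone) + [far_point]


def assign_knot(backbone, classify, n_closures=100, threshold=0.5, seed=0):
    """Assign a knot type only if one type dominates the closure statistics.

    `classify` is a HYPOTHETICAL callable: closed polygon -> knot label.
    Returns (label or None, Counter of labels over all closures).
    """
    rng = random.Random(seed)
    counts = Counter(
        classify(random_closure(backbone, rng)) for _ in range(n_closures)
    )
    label, hits = counts.most_common(1)[0]
    # ASSUMPTION: "dominates" means a frequency strictly above `threshold`.
    dominant = label if hits / n_closures > threshold else None
    return dominant, counts
```

A call such as `assign_knot(coords, classify)` with backbone coordinates `coords` returns `None` as the label when no single knot type exceeds the threshold, which corresponds to a protein whose apparent knot depends on how the chain is closed.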
