Representing and comparing protein structures as paths in three-dimensional space

Background: Most existing formulations of protein structure comparison are based on detailed atomic-level descriptions of protein structures and bypass potential insights that arise from a higher-level abstraction.

Results: We propose a structure comparison approach based on a simplified representation of a protein that describes its three-dimensional path by the local curvature along the generalized backbone of the polypeptide. We have implemented a dynamic programming procedure that aligns the curvatures of two proteins by optimizing a defined measure, the sum of turning-angle deviations.

Conclusion: Although our procedure does not directly optimize global structural similarity as measured by RMSD, our benchmarking results indicate that it recovers, surprisingly well, the structural similarity defined by structure classification databases and traditional structure alignment programs. In addition, our program can recognize similarities between structures with extensive conformational changes that are beyond the ability of traditional structure alignment programs. We demonstrate applications of the procedure in several contexts of structure comparison. An implementation of our procedure, CURVE, is available as a public webserver.
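To make the idea of aligning curvature profiles concrete, the following is a minimal sketch and not the CURVE implementation. It assumes the backbone is given as a list of 3D points (for example C-alpha coordinates) with no two consecutive points identical, takes the turning angle at each interior point as the angle between successive direction vectors, and uses a plain global dynamic programming recurrence with a linear gap penalty. The function names, the absolute-difference cost, and the `gap` value are illustrative assumptions; the abstract does not specify the actual curvature definition, scoring, or gap model.

```python
import math


def turning_angles(coords):
    """Turning angle (radians) at each interior point of a 3D path.

    Assumes consecutive points are distinct, so direction vectors are nonzero.
    """
    angles = []
    for i in range(1, len(coords) - 1):
        u = [coords[i][k] - coords[i - 1][k] for k in range(3)]
        v = [coords[i + 1][k] - coords[i][k] for k in range(3)]
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        # Clamp to [-1, 1] to guard against floating-point drift.
        cos_t = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
        angles.append(math.acos(cos_t))
    return angles


def align_angles(a, b, gap=0.5):
    """Minimum summed turning-angle deviation over global alignments.

    Aligned positions cost |a[i] - b[j]|; each unaligned position costs
    `gap` (a hypothetical value chosen only for illustration).
    """
    n, m = len(a), len(b)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap
    for j in range(1, m + 1):
        cost[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(a[i - 1] - b[j - 1]),  # align i with j
                cost[i - 1][j] + gap,                            # gap in b
                cost[i][j - 1] + gap,                            # gap in a
            )
    return cost[n][m]
```

Because turning angles depend only on the local shape of the path, such a score is unchanged by rigid rotation or translation of either structure, which is one reason a curvature-based comparison can tolerate large hinge-like conformational changes that inflate RMSD.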
