A simple method of identifying symmetric substructures of proteins

Accurate identification of internal symmetric substructures of proteins is needed in studies of protein evolution and in protein design. To overcome the difficulties encountered by previous methods, we propose a simple quantitative method that combines a similarity matrix with Pearson's correlation analysis. The distance root-mean-square deviation (dRMSD) is used to measure the similarity of two substructures in a protein. We applied this method to proteins of the beta-propeller, jelly roll, and beta-trefoil families. The results show that the method not only detects the internal repetitive structures in proteins effectively, but also identifies their locations easily.
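
The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how substructures are chosen or how Pearson's correlation is applied, so the sliding window of fixed length, the use of one coordinate per residue (for example C-alpha atoms), and the row-wise correlation of the similarity matrix are all assumptions made here for illustration. The function names `pairwise_distances`, `drmsd`, `similarity_matrix`, and `row_correlation` are hypothetical.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the n x n matrix of Euclidean distances between n points."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def drmsd(frag_a, frag_b):
    """Distance RMSD between two fragments of equal length.

    Compares intra-fragment distances d_ij over all pairs i < j, so the
    value does not change under rotation or translation of either fragment.
    """
    if frag_a.shape != frag_b.shape:
        raise ValueError("fragments must have the same shape")
    upper = np.triu_indices(len(frag_a), k=1)
    da = pairwise_distances(frag_a)[upper]
    db = pairwise_distances(frag_b)[upper]
    return np.sqrt(np.mean((da - db) ** 2))


def similarity_matrix(coords, window):
    """dRMSD between every pair of sliding windows (assumed scheme).

    Entry (i, j) compares residues i..i+window-1 with j..j+window-1.
    Low values mark pairs of structurally similar substructures.
    """
    m = len(coords) - window + 1
    sim = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            value = drmsd(coords[i:i + window], coords[j:j + window])
            sim[i, j] = sim[j, i] = value
    return sim


def row_correlation(sim):
    """Pearson correlation between rows of the similarity matrix.

    Assumption: rows with similar dRMSD profiles are treated as
    candidates for equivalent positions in repeated units.
    """
    return np.corrcoef(sim)
```

Given an array `coords` of shape `(n_residues, 3)`, `similarity_matrix(coords, window)` yields a symmetric matrix with a zero diagonal, and off-diagonal bands of low dRMSD would indicate where a substructure recurs along the chain. The window length would have to be chosen to suit the repeat size expected for the fold.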
