Convergent evolution in structural elements of proteins investigated using cross profile analysis

Background

The evolutionary relations of similar segments shared by different protein folds remain controversial, even though many examples of such segments have been found. Several methods, including those based on structure comparisons, sequence-based classifications, and sequence-based profile-profile comparisons, have been applied to identify protein segments that are locally similar in both sequence and structure across protein folds. However, no method reported to date captures sequence-structure relations more precisely by combining structure-based profiles with sequence-based profiles built from evolutionary information. Structure-based profiles, which are derived from structural classifications of protein segments, are generally regarded as representing the amino acid preferences at each position of a specific segment conformation. They might therefore reflect the nature of ancient short-peptide ancestors.

Results

This report describes the development and use of "Cross Profile Analysis", which compares sequence-based profiles with structure-based profiles built from the amino acid occurrences at each position within a protein segment cluster. Using systematic cross profile analysis, we found structural clusters of 9-residue and 15-residue segments that correlate remarkably strongly with particular sequence profiles. These correlations reflect structural similarities among the constituent segments of both the sequence-based and the structure-based profiles. We also report previously undetectable sequence-structure patterns that transcend protein family and fold boundaries, and we present the results of a conformational analysis of the peptide deduced from one segment cluster. These results suggest the existence of ancient short-peptide ancestors.

Conclusions

Cross profile analysis reveals the polyphyletic and convergent evolution of β-hairpin-like structures, which were verified both experimentally and computationally. The results presented here give new insights into the evolution of short protein segments.
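
The comparison of a structure-based profile with a sequence-based profile can be illustrated with a minimal sketch. It is not the published procedure: the abstract does not specify the scoring function, so the Pearson correlation, the pseudocount, the function names `build_profile` and `profile_correlation`, and the toy 9-residue segments below are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(segments, pseudocount=1.0):
    """Return one list of 20 amino acid frequencies per segment position.

    `segments` are equal-length strings that belong to one cluster.
    The pseudocount (an assumption) keeps unseen residues above zero.
    """
    length = len(segments[0])
    if any(len(s) != length for s in segments):
        raise ValueError("all segments must have the same length")
    profile = []
    for pos in range(length):
        counts = Counter(s[pos] for s in segments)
        # Only the 20 standard residues contribute to the column total.
        total = sum(counts[aa] for aa in AMINO_ACIDS)
        total += pseudocount * len(AMINO_ACIDS)
        profile.append(
            [(counts[aa] + pseudocount) / total for aa in AMINO_ACIDS]
        )
    return profile


def profile_correlation(p, q):
    """Pearson correlation between two profiles of identical shape."""
    x = [v for column in p for v in column]
    y = [v for column in q for v in column]
    if len(x) != len(y):
        raise ValueError("profiles must have the same shape")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a flat profile carries no signal to correlate
    return cov / (norm_x * norm_y)


# Toy data (hypothetical sequences, not taken from the paper).
structure_cluster = ["KVTVNGKTY", "RVEVDGKTY", "KITVNGQEY"]
sequence_family = ["KVTVDGKTY", "RVTVNGKEY"]
unrelated_family = ["AELLKKAAE", "EELLRKAEE"]

structure_profile = build_profile(structure_cluster)
r_related = profile_correlation(structure_profile, build_profile(sequence_family))
r_unrelated = profile_correlation(structure_profile, build_profile(unrelated_family))
print(f"related: {r_related:.3f}  unrelated: {r_unrelated:.3f}")
```

In this sketch both profile types are built from plain residue counts. A sequence-based profile in practice would more likely come from a multiple alignment of homologs, so the second argument stands in for that input.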
