The Evolution and Structure Prediction of Coiled Coils across All Genomes

Coiled coils are α-helical interactions found in many natural proteins. Various sequence-based coiled-coil predictors are available, but key issues remain: prediction of oligomeric state and of protein-protein interfaces, and extension to all genomes. We present SpiriCoil (http://supfam.org/SUPERFAMILY/spiricoil), which takes a novel approach to predicting coiled coils that fall into known superfamilies: it uses hundreds of hidden Markov models representing coiled-coil-containing domain families. Modelling whole domains has the advantage that the sequences flanking the coiled coils also inform the prediction. SpiriCoil performs at least as well as existing methods at detecting coiled coils and significantly advances the state of the art for oligomeric-state prediction. SpiriCoil has been run on over 16 million sequences, including all completely sequenced genomes (more than 1200), and the resulting Web interface supplies data downloads, alignments, scores, oligomeric-state classifications, three-dimensional homology models and visualisation. This has allowed, for the first time, a genome-wide analysis of coiled-coil evolution. We found that coiled coils have arisen independently de novo well over a hundred times, and that they are observed in 16 different oligomeric states. Coiled coils in almost all oligomeric states were present in the last universal common ancestor of life. The vast majority of de novo origins of individual coiled coils predate the last universal common ancestor; we do, however, observe scattered instances throughout subsequent evolutionary history, mostly during the formation of the eukaryote superkingdom. Coiled coils do not change their oligomeric state over evolution, and they did not evolve from the rearrangement of existing helices in proteins; rather, coiled coils were forged in unison with the fold of the whole protein.
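The family-based idea can be illustrated with a minimal sketch. It assumes that a query sequence has already been scored against each domain-family model (for example, as an E-value from a hidden Markov model search) and that each family carries a single oligomeric-state annotation. The family names, E-values, threshold and function name below are hypothetical and are not taken from SpiriCoil.

```python
from typing import Dict, Optional, Tuple

# Hypothetical annotations: family identifier -> oligomeric state.
FAMILY_STATE: Dict[str, str] = {
    "family_A": "parallel dimer",
    "family_B": "antiparallel dimer",
    "family_C": "trimer",
}


def predict_oligomeric_state(
    evalues: Dict[str, float],
    threshold: float = 1e-4,  # assumed significance cutoff, for illustration only
) -> Optional[Tuple[str, str]]:
    """Return (family, state) for the most significant annotated hit.

    Returns None when no annotated family scores at or below the threshold.
    """
    # Keep only significant hits to families with a known annotation.
    hits = {
        family: evalue
        for family, evalue in evalues.items()
        if evalue <= threshold and family in FAMILY_STATE
    }
    if not hits:
        return None
    # The lowest E-value is the most significant match.
    best = min(hits, key=hits.get)
    return best, FAMILY_STATE[best]


example = {"family_A": 3e-12, "family_C": 2e-5, "family_B": 0.8}
print(predict_oligomeric_state(example))  # ('family_A', 'parallel dimer')
```

Transferring the annotation of the best-matching family relies on the observation stated above that coiled coils do not change their oligomeric state over evolution; a sequence with no significant hit is simply left unclassified.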
