A segment of cold shock protein directs the folding of a combinatorial protein.

It has been suggested that protein domains evolved by the non-homologous recombination of building blocks of subdomain size. In earlier work we attempted to recapitulate domain evolution in vitro. We took a polypeptide segment comprising three beta-strands in the monomeric, five-stranded beta-barrel cold shock protein (CspA) of Escherichia coli as a building block. This segment corresponds to a complete exon in homologous eukaryotic proteins and includes residues that nucleate folding in CspA. We recombined this segment at random with fragments of natural proteins and succeeded in generating a range of folded chimaeric proteins. We now present the crystal structure of one such combinatorial protein, 1b11, a 103-residue polypeptide that includes segments from CspA and the S1 domain of the 30S ribosomal subunit of E. coli. The structure reveals a segment-swapped, six-stranded beta-barrel of unique architecture that assembles into a tetramer. Surprisingly, the CspA segment retains its structural identity in 1b11, recapitulating its original fold and deforming the structure of the S1 segment as necessary to complete a barrel. Our work provides structural evidence that (i) random shuffling of non-homologous polypeptide segments can lead to folded proteins and unique architectures, (ii) many structural features of the segments are retained, and (iii) some segments can act as templates around which the rest of the protein folds.
