Massive sequence perturbation of a small protein.

Most protein topologies occur rarely in nature, which limits our ability to extract sequence information that could be used to predict the structure, function, and evolutionary constraints of protein folds. In principle, the sequence diversity explored by a given protein topology could be expanded by introducing sequence perturbations and selecting variant proteins that fold correctly. However, our capacity to explore sequence space is intrinsically limited by the enormous number of sequences that can be generated from the 20 amino acids and by the small number of variants likely to fold. Here we sought to test whether the sequence space of naturally existing proteins can be explored by simple, sequential degeneration of a complete set of short sequence segments of a model protein, without long-range covariation. Using the Raf Ras-binding domain as a model of a small protein capable of autonomous folding, we degenerated 72 of the 76 positions of the primary structure to all 20 amino acids, in segments of four to seven residues defined by secondary structure, and selected the folded species by their interaction with H-Ras in an in vivo survival-selection assay. The methodology allowed for rigorous statistical analysis and comparison of sequence diversity. The ensemble of Raf Ras-binding domain sequence variants obtained recaptures the diversity observed for the ubiquitin-roll topology. A signature sequence for this fold and the implications of this strategy for protein design and structure prediction are discussed.
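The scale argument behind the segment-wise strategy can be made concrete with a short calculation. The sketch below is illustrative only: it compares the number of sequences reachable by degenerating all 72 positions at once with the total obtained by degenerating one short segment at a time. The function names and the partition `HYPOTHETICAL_SEGMENTS` are assumptions made for this example; the abstract states only that segments span four to seven residues and does not give the actual segmentation.

```python
# Illustrative arithmetic only; not the authors' analysis code.
AMINO_ACIDS = 20


def library_size(segment_length: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct sequences when every position in a segment is
    degenerated to the full alphabet (alphabet ** segment_length)."""
    return alphabet ** segment_length


def segmented_total(segment_lengths, alphabet: int = AMINO_ACIDS) -> int:
    """Total variants when segments are degenerated one at a time, so the
    per-segment library sizes add rather than multiply."""
    return sum(library_size(n, alphabet) for n in segment_lengths)


# ASSUMPTION: a made-up partition of the 72 degenerated positions into
# segments of 4 to 7 residues. The real boundaries follow secondary
# structure and are not given in the abstract.
HYPOTHETICAL_SEGMENTS = [4, 5, 6, 7] * 3 + [6]

if __name__ == "__main__":
    # Smallest and largest per-segment libraries: 20**4 = 160,000 and
    # 20**7 = 1,280,000,000.
    print(f"4-residue segment: {library_size(4):,} sequences")
    print(f"7-residue segment: {library_size(7):,} sequences")
    # Sequential degeneration of the hypothetical partition.
    print(f"segment-wise total: {segmented_total(HYPOTHETICAL_SEGMENTS):.3e}")
    # Simultaneous degeneration of all 72 positions.
    print(f"all 72 positions:   {library_size(72):.3e}")
```

Under these assumptions the segment-wise libraries stay on the order of billions of sequences, whereas simultaneous degeneration of all 72 positions exceeds 10^93, which is why sequential degeneration without long-range covariation is experimentally tractable.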
