High solubility of random-sequence proteins consisting of five kinds of primitive amino acids.

Searching for functional proteins among random-sequence libraries is a major challenge in protein engineering, in part because many random-sequence proteins are poorly soluble. A library in which most polypeptides are soluble and stable would therefore be of great benefit. Although modern proteins consist of 20 amino acids, it has been suggested that early proteins evolved from a reduced alphabet. Here, we constructed a library of random-sequence proteins consisting of only five amino acids (Ala, Gly, Val, Asp and Glu), which are believed to have been the most abundant in the prebiotic environment. Expression and characterization of arbitrarily chosen proteins from the library indicated that five-alphabet random-sequence proteins are more soluble than 20-alphabet random-sequence proteins with a similar level of hydrophobicity. These results support the reduced-alphabet hypothesis of the primordial genetic code and should also help in constructing optimized protein libraries for evolutionary protein engineering.
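
To make the phrase "a similar level of hydrophobicity" concrete, the following is a minimal sketch of how the mean hydropathy of a five-alphabet random sequence could be scored. It assumes the Kyte-Doolittle hydropathy scale and a uniform draw over the five residues; the abstract states neither the hydrophobicity measure nor the residue frequencies or sequence lengths actually used, and the function names `random_sequence` and `mean_hydropathy` are hypothetical.

```python
import random

# Kyte-Doolittle hydropathy values for the five residues of the reduced
# alphabet (assumed scale; the abstract does not name the measure used).
KD = {"A": 1.8, "G": -0.4, "V": 4.2, "D": -3.5, "E": -3.5}


def random_sequence(length, alphabet="AGVDE", rng=None):
    """Draw a random sequence, each residue chosen uniformly (an assumption)."""
    rng = rng or random.Random()
    return "".join(rng.choice(alphabet) for _ in range(length))


def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy per residue of the sequence."""
    return sum(KD[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    seq = random_sequence(80, rng=random.Random(0))  # length 80 is illustrative
    print(seq)
    print(round(mean_hydropathy(seq), 3))
```

Under these assumptions, a sequence containing the five residues in equal proportion has a mean hydropathy of -0.28, so a 20-alphabet comparison sequence would need a similar average score to be matched on this measure.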
