Design of synthetic gene libraries encoding random sequence proteins with desired ensemble characteristics

Libraries of random sequence polypeptides are useful as sources of unevolved proteins, novel ligands, and potential lead compounds for the development of vaccines and therapeutics. The expression of small random peptides has been achieved previously using DNA synthesized with equimolar mixtures of nucleotides. For many potential uses of random polypeptide libraries, concerns such as avoiding termination codons and matching target amino acid compositions make more complex designs necessary. In this study, three mixtures of nucleotides, corresponding to the three positions in the codon, were designed such that semirandom DNA synthesized by repeated cycles of the three mixtures created an open reading frame encoding random sequence polypeptides with desired ensemble characteristics. Two methods were used to design the nucleotide mixtures: the manual use of a spreadsheet and a refining grid search algorithm. Using design targets of at most 1% stop codons and an amino acid composition based on the average ratios observed in natural globular proteins, the two methods yielded similar nucleotide ratios. Semirandom DNA, synthesized with a designed, three‐residue repeat pattern, can encode libraries of very high diversity and represents an important tool for the construction of random polypeptide libraries.
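The design problem described above rests on a simple forward calculation: given one nucleotide mixture per codon position, the expected frequency of each codon is the product of the three base fractions, and summing over synonymous codons gives the ensemble amino acid composition and the stop codon fraction. The Python sketch below illustrates that calculation under the standard genetic code. It is not the authors' spreadsheet or grid search; the function name `ensemble_composition` and the equimolar input are illustrative assumptions, and the designed mixtures from the study are not reproduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def ensemble_composition(mix1, mix2, mix3):
    """Return expected amino acid frequencies ('*' = stop) for three
    per-position nucleotide mixtures, each a dict base -> fraction.

    Hypothetical helper for illustration; assumes bases are incorporated
    independently at each position, in proportion to the mixture.
    """
    freqs = {}
    for codon, aa in CODON_TABLE.items():
        p = mix1[codon[0]] * mix2[codon[1]] * mix3[codon[2]]
        freqs[aa] = freqs.get(aa, 0.0) + p
    return freqs


# Baseline: equimolar nucleotides at every codon position.
equimolar = {b: 0.25 for b in BASES}
comp = ensemble_composition(equimolar, equimolar, equimolar)

print(f"stop codon fraction: {comp['*']:.4f}")
for aa in sorted(a for a in comp if a != "*"):
    print(f"{aa}: {comp[aa]:.4f}")
```

With equimolar mixtures, 3 of the 64 codons are stops, so about 4.7% of codons terminate translation, well above a 1% target. A search over the twelve base fractions, whether manual or automated, would repeatedly evaluate a function of this kind and compare its output with the target composition.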
