Strategies and computational tools for improving randomized protein libraries.

In the last decade, directed evolution has become a routine approach for engineering proteins with novel or altered properties. Concurrently, a trend away from purely 'blind' randomization strategies and towards more 'semi-rational' approaches has also become apparent. In this review, we discuss ways in which structural information and predictive computational tools are playing an increasingly important role in guiding the design of randomized libraries: web servers such as ConSurf-HSSP and SCHEMA allow the prediction of sites to target for producing functional variants, while algorithms such as GLUE, PEDEL and DRIVeR are useful for estimating library completeness and diversity. In addition, we review recent methodological developments that facilitate the construction of unbiased libraries, which are inherently more diverse than biased libraries and therefore more likely to yield improved variants.
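
The idea of estimating library completeness can be illustrated with a short calculation. The sketch below is not the implementation of GLUE, PEDEL or DRIVeR; it assumes the simplest possible model, in which every one of V possible variants is equally likely and the library contains L independently sampled clones, so that the expected number of distinct variants follows the standard Poisson approximation V(1 - e^(-L/V)). The function and parameter names are illustrative only.

```python
import math


def expected_distinct(num_variants: int, library_size: int) -> float:
    """Expected number of distinct variants in a library.

    Assumes all `num_variants` possible variants are equiprobable and
    that `library_size` clones are sampled independently (Poisson
    approximation). Real libraries are usually biased, so this is an
    optimistic estimate.
    """
    return num_variants * (1.0 - math.exp(-library_size / num_variants))


def completeness(num_variants: int, library_size: int) -> float:
    """Expected fraction of all possible variants present in the library."""
    return expected_distinct(num_variants, library_size) / num_variants


def size_for_completeness(num_variants: int, target: float) -> float:
    """Library size needed to reach an expected completeness of `target`.

    `target` must lie strictly between 0 and 1.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return -num_variants * math.log(1.0 - target)


if __name__ == "__main__":
    # Hypothetical example: three fully randomized NNK codons give
    # 32**3 possible codon combinations.
    v = 32 ** 3
    print(f"completeness at L = V: {completeness(v, v):.3f}")
    print(f"clones for 95% completeness: {size_for_completeness(v, 0.95):.0f}")
```

Under these assumptions, a library containing as many clones as there are possible variants is expected to cover only about 63% of them, and roughly a threefold excess is needed to reach 95% coverage. Any mutational or codon bias lowers the coverage achieved at a given library size, which is one reason unbiased construction methods matter.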
