Protein Fragment Swapping: A Method for Asymmetric, Selective Site-Directed Recombination

This article presents a new approach to site-directed recombination: combinations of selected discontiguous fragments from a source protein are swapped in place of the corresponding fragments of a target protein. Because the method is both asymmetric (it differentiates source from target) and selective (it swaps discontiguous fragments), it focuses experimental effort on a more restricted portion of sequence space and constructs hybrids that are more likely to have the properties sought by the experiment. Furthermore, since the source and target need to be structurally homologous only locally rather than overall, the method supports swapping fragments from functionally important regions of a source into a target "scaffold" (for example, to humanize an exogenous therapeutic protein). A protein fragment swapping plan is defined by the residue position boundaries of the fragments to be swapped. It is assessed by an average potential score over the resulting hybrid library, with singleton and pairwise terms evaluating the importance and fit of the swapped residues. We prove that choosing an optimal set of fragments under such a potential score is NP-hard, and we develop an integer programming approach, which we call Swagmer, that nevertheless works very well in practice. We demonstrate the effectiveness of our method on three swapping problems: selective recombination between beta-lactamases, activity swapping between glutathione transferases, and activity swapping between carboxylases and mutases in the purE family. We show that the selective recombination approach generates better plans (in terms of resulting potential score) than traditional site-directed recombination approaches. We also show that in all cases the optimized experiments are significantly better than ones that would result from stochastic methods.
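To make the notion of a plan score concrete, the following Python fragment is a minimal sketch of averaging a potential over a hybrid library. It is not the Swagmer implementation. It assumes that source and target are pre-aligned strings of equal length, that fragments are half-open index ranges, and that the library contains every combination of swapped and unswapped fragments. The function names and the toy singleton and pairwise potentials are hypothetical and serve only as illustration.

```python
from itertools import product


def build_hybrid(target, source, fragments, mask):
    """Copy source residues into the target for each fragment selected by mask.

    Assumes target and source are aligned and of equal length; fragments are
    half-open (start, end) index ranges.
    """
    seq = list(target)
    for (start, end), use in zip(fragments, mask):
        if use:
            seq[start:end] = source[start:end]
    return "".join(seq)


def hybrid_score(hybrid, singleton, pairwise):
    """Sum the singleton terms over positions and the pairwise terms over position pairs."""
    n = len(hybrid)
    total = sum(singleton(i, hybrid[i]) for i in range(n))
    total += sum(
        pairwise(i, j, hybrid[i], hybrid[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return total


def plan_score(target, source, fragments, singleton, pairwise):
    """Average the hybrid score over all 2^k swap combinations (an assumed library)."""
    masks = list(product([False, True], repeat=len(fragments)))
    scores = [
        hybrid_score(build_hybrid(target, source, fragments, m), singleton, pairwise)
        for m in masks
    ]
    return sum(scores) / len(scores)


# Hypothetical toy potentials: reward source residues ('B'), and penalize
# adjacent positions whose residues come from different parents.
def toy_singleton(i, aa):
    return 1.0 if aa == "B" else 0.0


def toy_pairwise(i, j, a, b):
    return -0.5 if j == i + 1 and a != b else 0.0


if __name__ == "__main__":
    plan = [(0, 2), (4, 6)]  # two discontiguous fragments
    print(plan_score("AAAAAA", "BBBBBB", plan, toy_singleton, toy_pairwise))
```

Enumerating all 2^k hybrids is practical only for small numbers of fragments, and this sketch scores a given plan rather than choosing one. Selecting the best fragment boundaries is the NP-hard problem that the integer programming formulation addresses.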
