Protein Fragment Swapping: A Method for Asymmetric, Selective Site-Directed Recombination

This paper presents a new approach to site-directed recombination that swaps combinations of selected discontiguous fragments from a source protein in place of the corresponding fragments of a target protein. Because it is both asymmetric (differentiating source and target) and selective (swapping discontiguous fragments), our method focuses experimental effort on a more restricted portion of sequence space, constructing hybrids that are more likely to have the properties sought by the experiment. Furthermore, since the source and target need to be structurally homologous only locally (rather than overall), our method supports swapping fragments from functionally important regions of a source into a target "scaffold", e.g., to humanize an exogenous therapeutic protein. A protein fragment swapping plan is defined by the residue position boundaries of the fragments to be swapped; it is assessed by an average potential score over the resulting hybrid library, with singleton and pairwise terms evaluating the importance and fit of the swapped residues. While we prove that it is NP-hard to choose an optimal set of fragments under such a potential score, we develop an integer programming approach, which we call Swagmer, that works very well in practice. We demonstrate the effectiveness of our method on two types of swapping problems: selective recombination between beta-lactamases and activity swapping between glutathione transferases. We show that the selective recombination approach generates a better plan (in terms of resulting potential score) than a traditional site-directed recombination approach. We also show that in both cases the optimized experiment is significantly better than one that would result from stochastic methods.
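
To make the notion of a plan's score concrete, the following is a minimal illustrative sketch, not the Swagmer integer program itself. It assumes that the source and target are already aligned to the same length, that the hybrid library contains every combination of swapped and unswapped fragments (including the unmodified target), and that the potential is a plain sum of singleton and pairwise terms over a given contact list. All function names and the toy potentials are hypothetical.

```python
from itertools import product


def make_hybrid(target, source, fragments, mask):
    """Copy the source residues of each selected fragment into the target.

    fragments: list of (start, end) residue positions, 0-based and inclusive.
    mask: one boolean per fragment; True means "swap this fragment in".
    """
    residues = list(target)
    for (start, end), swapped in zip(fragments, mask):
        if swapped:
            residues[start:end + 1] = source[start:end + 1]
    return "".join(residues)


def hybrid_library(target, source, fragments):
    """Assumed library: all 2**k hybrids over subsets of the k fragments."""
    return [make_hybrid(target, source, fragments, mask)
            for mask in product([False, True], repeat=len(fragments))]


def potential(seq, singleton, pairwise, contacts):
    """Sum of singleton terms over positions and pairwise terms over contacts."""
    single = sum(singleton(i, a) for i, a in enumerate(seq))
    pair = sum(pairwise(i, j, seq[i], seq[j]) for i, j in contacts)
    return single + pair


def average_score(target, source, fragments, singleton, pairwise, contacts):
    """Average potential over the hybrid library defined by the plan."""
    library = hybrid_library(target, source, fragments)
    total = sum(potential(h, singleton, pairwise, contacts) for h in library)
    return total / len(library)


# Toy data (hypothetical): "B" marks source residues, "A" target residues.
target, source = "AAAAAA", "BBBBBB"
plan = [(0, 1), (4, 5)]           # two discontiguous fragments
contacts = [(1, 2), (0, 5)]       # assumed residue-residue contacts


def toy_singleton(i, a):
    # Reward each residue taken from the source.
    return 1.0 if a == "B" else 0.0


def toy_pairwise(i, j, a, b):
    # Penalize contacts that mix source and target residues.
    return -1.0 if a != b else 0.0


score = average_score(target, source, plan, toy_singleton, toy_pairwise, contacts)
```

Exhaustive enumeration like this grows as 2**k in the number of fragments and only scores a given plan; choosing the fragment boundaries that optimize the score is the NP-hard problem the paper addresses with integer programming.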
