Using a residue clash map to functionally characterize protein recombination hybrids.

In this article, we introduce a rapid, protein sequence database-driven approach that checks every contacting residue pair in a protein hybrid for inconsistency with the structural features of its protein family. The approach examines contacting residue pairs whose members originate from different parents for several types of potentially unfavorable interactions (i.e., electrostatic repulsion, steric hindrance, cavity formation and hydrogen bond disruption). The clashing residue pairs identified between members of a protein family are then contrasted against functionally characterized hybrid libraries. Comparisons for five protein recombination studies available in the literature, namely (i) glycinamide ribonucleotide transformylase (GART) from Escherichia coli (purN) and human (hGART), (ii) human Mu class glutathione S-transferase (GST) M1-1 and M2-2, (iii) beta-lactamase TEM-1 and PSE-4, (iv) catechol-2,3-oxygenase xylE and nahH, and (v) dioxygenases (toluene dioxygenase, tetrachlorobenzene dioxygenase and biphenyl dioxygenase), reveal that the patterns of identified clashing residue pairs are remarkably consistent with the experimentally observed functional crossover profiles. Specifically, we show that the proposed residue clash maps are on average 5.0 times more effective than randomly generated clashes and 1.6 times more effective than residue contact maps at explaining the observed crossover distributions among functional members of hybrid libraries. This suggests that residue clash maps can provide quantitative guidelines for the placement of crossovers in the design of protein recombination experiments.
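To make the idea of a residue clash map concrete, the following Python sketch flags contacting residue pairs whose two members come from different parents. It is an illustration under stated assumptions, not the method used in the article: the names `classify_clash` and `clash_map`, the charge table, the approximate residue volumes, and the 50 Å³ tolerance are all hypothetical. It also compares each hybrid pair only against the two parental pairs rather than against a protein family database, and it omits hydrogen bond disruption entirely.

```python
# Illustrative sketch only. The tables, the tolerance, and the function names
# are assumptions made for this example, not values taken from the article.

# Formal side-chain charge near neutral pH (histidine treated as neutral here).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}

# Approximate residue volumes in cubic angstroms (assumed values).
VOLUME = {
    "G": 60.1, "A": 88.6, "S": 89.0, "C": 108.5, "D": 111.1,
    "P": 112.7, "N": 114.1, "T": 116.1, "E": 138.4, "V": 140.0,
    "Q": 143.8, "H": 153.2, "M": 162.9, "I": 166.7, "L": 166.7,
    "K": 168.6, "R": 173.4, "F": 189.9, "Y": 193.6, "W": 227.8,
}


def classify_clash(res_i, res_j, parental_pairs, vol_tol=50.0):
    """Return the clash types assigned to one hybrid residue pair.

    parental_pairs holds the (residue, residue) pairs seen at the same two
    positions in the parents; they serve as the reference for pair volume.
    """
    clashes = []
    # Same-sign charges in contact are flagged as electrostatic repulsion.
    if CHARGE.get(res_i, 0) * CHARGE.get(res_j, 0) > 0:
        clashes.append("electrostatic repulsion")
    pair_volume = VOLUME[res_i] + VOLUME[res_j]
    reference = [VOLUME[a] + VOLUME[b] for a, b in parental_pairs]
    if pair_volume > max(reference) + vol_tol:
        clashes.append("steric hindrance")   # pair is much bulkier than any parent pair
    elif pair_volume < min(reference) - vol_tol:
        clashes.append("cavity formation")   # pair is much smaller than any parent pair
    return clashes


def clash_map(parent_a, parent_b, contacts, vol_tol=50.0):
    """Map (i, j, origin) to clash types for aligned, equal-length parents.

    contacts is a list of 0-based position pairs (i, j) that are in contact.
    origin "A-B" means position i comes from parent A and position j from B.
    """
    result = {}
    for i, j in contacts:
        parental = [(parent_a[i], parent_a[j]), (parent_b[i], parent_b[j])]
        hybrids = {
            "A-B": (parent_a[i], parent_b[j]),
            "B-A": (parent_b[i], parent_a[j]),
        }
        for origin, pair in hybrids.items():
            if pair in parental:
                continue  # identical to a parental pair, so nothing new to test
            found = classify_clash(pair[0], pair[1], parental, vol_tol)
            if found:
                result[(i, j, origin)] = found
    return result


if __name__ == "__main__":
    # Toy parents: positions 0 and 3 form a salt bridge, 1 and 2 pack together.
    for key, value in clash_map("KGWE", "EWGK", [(0, 3), (1, 2)]).items():
        print(key, value)
```

In this toy case a crossover between positions 0 and 3 places two like charges in contact, while a crossover between positions 1 and 2 pairs either two glycines or two tryptophans. A map of such flagged pairs along the sequence is the kind of information that could be compared with observed crossover positions in functional hybrids.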
