Identifying residue–residue clashes in protein hybrids by using a second-order mean-field approach

This article introduces a second-order mean-field approach for characterizing the complete set of residue–residue couplings consistent with a given protein structure. This information is then used to classify protein hybrids by their potential to be functional, based on the presence or absence and the severity of clashing residue–residue interactions. First, atomistic representations of both the native and denatured states are used to calculate rotamer–backbone, rotamer–intrinsic, and rotamer–rotamer conformational energies. Next, this complete conformational energy table is coupled with a second-order mean-field description to obtain the probabilities of all possible rotamer–rotamer combinations in the ensemble that minimizes the Helmholtz free energy. Computational results for the dihydrofolate reductase family reveal correlated substitution patterns not only between contacting but also between distal second-order structural elements. Residue–residue clashes in hybrid proteins are quantified by contrasting the ensemble probabilities of the hybrids with those of the original parental sequences. Good agreement with experimental data is demonstrated by superimposing these clashes on the functional crossover profiles of bidirectional incremental truncation libraries for Escherichia coli and human glycinamide ribonucleotide transformylases.
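The abstract does not state the second-order mean-field equations, so the following is only a rough illustration of how pairwise rotamer–rotamer probabilities can be obtained from an energy table. It uses sum-product belief propagation, a Bethe-type pairwise approximation chosen here as a stand-in and not necessarily the formulation used in the article. The function names, the toy three-position chain, the random energies, and the value of `KT` are all assumptions made for the example.

```python
import itertools
import numpy as np

KT = 0.593  # assumed thermal energy, roughly kcal/mol near room temperature


def pairwise_marginals(single, pair, n_iter=50):
    """Approximate joint rotamer probabilities by sum-product message passing.

    single: list of 1-D arrays, one-body energies E_i(r) per position (hypothetical)
    pair:   dict {(i, j): 2-D array E_ij(r, s)} for coupled positions, i < j
    """
    psi = [np.exp(-e / KT) for e in single]
    phi = {}
    for (i, j), e in pair.items():
        phi[(i, j)] = np.exp(-e / KT)
        phi[(j, i)] = phi[(i, j)].T
    neighbors = {i: [j for (a, j) in phi if a == i] for i in range(len(single))}
    # msg[(i, j)] is the message from position i to position j,
    # indexed by the rotamer state of j.
    msg = {(i, j): np.ones(len(single[j])) for (i, j) in phi}

    def cavity(i, exclude):
        # One-body weight of i times all incoming messages except from `exclude`.
        out = psi[i].copy()
        for k in neighbors[i]:
            if k != exclude:
                out *= msg[(k, i)]
        return out

    for _ in range(n_iter):
        new = {}
        for (i, j) in msg:
            m = phi[(i, j)].T @ cavity(i, j)  # sum over rotamers of i
            new[(i, j)] = m / m.sum()
        msg = new

    joint = {}
    for (i, j) in pair:
        p = cavity(i, j)[:, None] * phi[(i, j)] * cavity(j, i)[None, :]
        joint[(i, j)] = p / p.sum()
    return joint


def exact_pair_marginals(single, pair):
    """Brute-force Boltzmann pair marginals, feasible only for tiny systems."""
    joint = {key: np.zeros(e.shape) for key, e in pair.items()}
    z = 0.0
    for state in itertools.product(*[range(len(e)) for e in single]):
        energy = sum(single[i][r] for i, r in enumerate(state))
        energy += sum(e[state[i], state[j]] for (i, j), e in pair.items())
        w = np.exp(-energy / KT)
        z += w
        for (i, j) in pair:
            joint[(i, j)][state[i], state[j]] += w
    return {key: value / z for key, value in joint.items()}


# Toy data: three positions with three rotamers each, coupled as a chain.
rng = np.random.default_rng(0)
single = [rng.normal(size=3) for _ in range(3)]
pair = {(0, 1): rng.normal(size=(3, 3)), (1, 2): rng.normal(size=(3, 3))}
bp = pairwise_marginals(single, pair)
exact = exact_pair_marginals(single, pair)
```

On a chain such as this toy case, message passing reproduces the exact Boltzmann pair marginals, which makes the brute-force routine a convenient check. On a real contact network with loops the result is only an approximation, and the energies would come from a force-field calculation rather than random numbers.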
