Automated scaffold selection for enzyme design

A major goal of computational protein design is the construction of novel functions on existing protein scaffolds. The first question is which scaffold is suitable for a specific reaction: given a set of catalytic residues and their spatial arrangement, one wants to identify a protein scaffold that can host this active site. Here, we present an algorithm called ScaffoldSelection that rapidly searches large sets of protein structures for potential attachment sites of an enzymatic motif. The method consists of two steps. It first identifies pairs of backbone positions in pocket‐like regions; it then combines these pairs into complete attachment sites using a graph‐theoretical approach. Identified matches are assessed for their ability to accommodate the substrate or transition state. A representative set of structures from the Protein Data Bank (∼3500) was searched for backbone geometries that support the catalytic residues for 12 chemical reactions. Recapitulation of native active site geometries is used as a benchmark for the performance of the program. The native motif is identified in all 12 test cases and ranks in the top percentile in 5 of the 12. The algorithm is fast and efficient, although this depends on the complexity of the motif. Comparisons to other methods show that ScaffoldSelection is equally accurate and far faster. Thus, ScaffoldSelection will aid future computational protein design experiments by preselecting protein scaffolds that are suitable for a specific reaction type and for the introduction of a predefined amino acid motif. Proteins 2009. © 2009 Wiley‐Liss, Inc.
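The two-step idea (pairwise matching, then graph-based combination) can be illustrated with a minimal Python sketch. It is not the ScaffoldSelection implementation: the abstract says only that a "graph theoretical approach" is used, so enumerating cliques with the Bron-Kerbosch recursion is an assumption made here for illustration. The sketch also reduces each residue to a single point and compares distances only, with no pocket detection or backbone orientation. All function names and the `tol` parameter are hypothetical.

```python
import math
from itertools import combinations


def compatible_pairs(motif, positions, tol):
    """Step 1 (simplified): build a compatibility graph.

    A node (i, p) means "motif residue i is placed at scaffold position p".
    Two nodes are linked when they use different residues and positions and
    the motif distance matches the scaffold distance within tol.
    """
    nodes = [(i, p) for i in range(len(motif)) for p in range(len(positions))]
    adj = {n: set() for n in nodes}
    for (i, p), (j, q) in combinations(nodes, 2):
        if i == j or p == q:
            continue
        d_motif = math.dist(motif[i], motif[j])
        d_scaffold = math.dist(positions[p], positions[q])
        if abs(d_motif - d_scaffold) <= tol:
            adj[(i, p)].add((j, q))
            adj[(j, q)].add((i, p))
    return adj


def bron_kerbosch(adj, r, p, x, out):
    """Collect all maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    if not p and not x:
        out.append(r)
        return
    for v in list(p):
        bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v], out)
        p = p - {v}
        x = x | {v}


def find_sites(motif, positions, tol=0.5):
    """Step 2 (simplified): a clique covering every motif residue is a site."""
    adj = compatible_pairs(motif, positions, tol)
    cliques = []
    bron_kerbosch(adj, frozenset(), set(adj), set(), cliques)
    return sorted(sorted(c) for c in cliques if len(c) == len(motif))


if __name__ == "__main__":
    # Hypothetical toy data: a three-residue motif and a four-position scaffold.
    motif = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
    scaffold = [(10.0, 10.0, 10.0), (13.0, 10.0, 10.0),
                (10.0, 14.0, 10.0), (50.0, 50.0, 50.0)]
    for site in find_sites(motif, scaffold):
        print(site)
```

Because nodes that share a motif residue are never linked, no clique can exceed the motif size, so every clique of that size is a complete, mutually consistent assignment. A realistic search would restrict candidate positions to pocket-like regions and check side-chain geometry and substrate fit, which this sketch omits.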
