Cofactory: Sequence‐based prediction of cofactor specificity of Rossmann folds

Obtaining an optimal cofactor balance to drive production is a challenge in metabolically engineered microbial production strains. To facilitate the identification of heterologous enzymes whose cofactor requirements are desirably altered relative to those of the native enzymes, we have developed Cofactory, a method that predicts enzyme cofactor specificity from primary amino acid sequence information alone. The algorithm identifies potential cofactor-binding Rossmann folds and predicts their specificity for the cofactors FAD(H2), NAD(H), and NADP(H). The Rossmann fold sequence search is carried out using hidden Markov models, whereas artificial neural networks are used for specificity prediction. Training was carried out using experimental data from protein–cofactor structure complexes. Overall performance was benchmarked against an independent evaluation set, yielding Matthews correlation coefficients of 0.94, 0.79, and 0.65 for FAD(H2), NAD(H), and NADP(H), respectively. The Cofactory method is publicly available at http://www.cbs.dtu.dk/services/Cofactory. Proteins 2014; 82:1819–1828. © 2014 Wiley Periodicals, Inc.
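
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a per-cofactor Matthews correlation coefficient can be computed in a one-vs-rest fashion from true and predicted labels. The function names and the toy labels are illustrative assumptions, not part of Cofactory, and the paper's exact evaluation protocol may differ.

```python
import math


def confusion_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN for one class treated as 'positive' (one-vs-rest)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif truth != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy labels, invented purely for illustration (not data from the paper).
    y_true = ["NAD", "NAD", "FAD", "NADP"]
    y_pred = ["NAD", "FAD", "FAD", "NADP"]
    for cofactor in ("FAD", "NAD", "NADP"):
        counts = confusion_counts(y_true, y_pred, cofactor)
        print(cofactor, round(matthews_corrcoef(*counts), 3))
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which is why it is often preferred over plain accuracy when the classes are unevenly sized.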
