Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation

The power of genome sequencing depends on our ability to understand what the sequenced genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and known sequence. Unfortunately, this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology: “orphan enzymes,” enzymes that have been characterized yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate that this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass spectrometry-based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identification. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be applied efficiently on a larger scale, making enzymology’s “back catalog” another powerful tool to drive accurate genome annotation.
