Retrieving sequences of enzymes experimentally characterized but erroneously annotated: the case of the putrescine carbamoyltransferase

Background
Annotating genomes remains a hazardous task. Mistakes or gaps in such a complex process may occur when relevant knowledge is ignored, whether lost, forgotten or overlooked. This paper exemplifies an approach that could help to resuscitate such meaningful data.

Results
We show that a set of closely related sequences that have been annotated as ornithine carbamoyltransferases are actually putrescine carbamoyltransferases. This demonstration rests on the following points: (i) use of enzymatic data that had been overlooked, (ii) rediscovery of a short NH2-terminal sequence that allows a wrongly annotated ornithine carbamoyltransferase to be reannotated as a putrescine carbamoyltransferase, (iii) identification of conserved motifs that distinguish unambiguously between the two kinds of carbamoyltransferases, and (iv) a comparative study of the gene context of these different sequences.

Conclusions
We explain why this specific case of misannotation had not yet been described and draw attention to the fact that analogous instances must be rather frequent. We urge special caution when high sequence similarity is coupled with an apparent lack of biochemical information. Moreover, from the point of view of genome annotation, proteins that have been studied experimentally but are not correlated with sequence data in current databases qualify as "orphans", just as unassigned genomic open reading frames do. The strategy used in this paper to bridge such gaps in knowledge could work whenever it is possible to collect a body of facts about experimental data, homology, unnoticed sequence data, and accurate information about gene context.
