Phylogeny of related functions: the case of polyamine biosynthetic enzymes.

Genome annotation requires explicit identification of gene function. This task frequently relies on aligning protein sequences with those of proteins of known function. Genetic drift, co-evolution of subunits in protein complexes and a variety of other constraints can limit the relevance of such alignments. Using a specific class of proteins, it is shown that a simple data analysis approach can help solve some of the problems posed. The origin of ureohydrolases was explored by comparing sequence similarity trees built to maximize amino acid conservation in the alignment. The trees separate agmatinases from arginases but suggest that unknown biases are responsible for the unexpected positions of some enzymes. Using factorial correspondence analysis, a distance tree between sequences was established by comparing the regions of the alignments that contain gaps. The gap tree gives a consistent picture of functional kinship, perhaps reflecting some aspects of phylogeny, with a clear domain of enzymes comprising two types of ureohydrolases (agmatinases and arginases) and activities related to, but different from, ureohydrolases. If the trees were significant, several annotated genes appeared to have been assigned the wrong function. These genes were cloned and their products expressed and identified biochemically, which substantiated the validity of the gap tree. The organization of the gap tree suggests a very ancient origin of ureohydrolases. Some enzymes of eukaryotic origin are spread throughout the arginase part of the trees: they might have derived from genes of the early symbiotic bacteria that became the organelles, genes that were transferred to the nucleus when symbiotic genes had to escape Muller's ratchet. This work also shows that arginases and agmatinases share the same two manganese-ion-binding sites and exhibit only subtle differences, which can be accounted for from the known three-dimensional structure of arginases.
In the absence of explicit biochemical data, extreme caution is needed when annotating genes having similarities to ureohydrolases.
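
The gap-based comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy alignment, the sequence names and the function names are hypothetical. Rather than running a full factorial correspondence analysis, it computes the chi-square distance between gap profiles, which is the metric that correspondence analysis is built on. The doubled coding (one "gap" and one "no gap" column per position) is an assumption added here so that every sequence has a non-empty profile.

```python
from math import sqrt

# Hypothetical toy alignment; names and sequences are illustrative only.
TOY_ALIGNMENT = {
    "arginase_A": "MKT--LLGV",
    "arginase_B": "MRT--LIGV",
    "agmatinase_A": "MKTAELL--",
    "agmatinase_B": "MRTGELI--",
}


def gap_indicator_matrix(alignment):
    """Return (names, rows) describing gap presence at informative positions.

    A position is informative when some, but not all, sequences have a gap.
    Each row is doubled: gap indicators followed by their complements.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    informative = [
        j for j in range(length)
        if 0 < sum(s[j] == "-" for s in seqs) < len(seqs)
    ]
    if not informative:
        raise ValueError("no alignment position distinguishes sequences by gaps")
    rows = []
    for s in seqs:
        gaps = [1 if s[j] == "-" else 0 for j in informative]
        rows.append(gaps + [1 - g for g in gaps])  # doubled coding (assumption)
    return names, rows


def chi_square_distances(rows):
    """Chi-square distances between row profiles of a contingency-like table."""
    total = sum(sum(r) for r in rows)
    n_cols = len(rows[0])
    col_mass = [sum(r[j] for r in rows) / total for j in range(n_cols)]
    profiles = [[v / sum(r) for v in r] for r in rows]
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i + 1, n):
            d2 = sum(
                (profiles[i][j] - profiles[k][j]) ** 2 / col_mass[j]
                for j in range(n_cols)
            )
            dist[i][k] = dist[k][i] = sqrt(d2)
    return dist


if __name__ == "__main__":
    names, rows = gap_indicator_matrix(TOY_ALIGNMENT)
    for name, row in zip(names, chi_square_distances(rows)):
        print(name, [round(d, 3) for d in row])
```

In this toy case, sequences sharing the same gap pattern are at distance zero and the two groups are clearly separated. A distance matrix of this kind could then be passed to any standard tree-building method; which one the original study used is not stated in this abstract.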
