Investigating 42 candidate orthologous protein groups by molecular evolutionary analysis on genome scale.

Accurately identifying orthologous genes/proteins is one of the key problems in comparative genomics. Here, 42 quartets of candidate orthologs from human, the yeast Saccharomyces cerevisiae, the nematode Caenorhabditis elegans, and the fruit fly Drosophila melanogaster, originally defined by similarity-based highest-hit criteria (Mushegian et al., 1998 Genome Res. 8: 590-598), were re-examined by molecular evolutionary analysis. We found that only 14 of the 42 candidate orthologous groups show truly one-to-one orthologous relationships, whereas the other groups are characterized by one-to-many or many-to-many orthologous relationships, or by even more complex scenarios involving gene duplications and/or gene losses. This result implies that classical one-to-one orthology might not be as common as is typically assumed, and that automated similarity-based methods should be used with caution when accurate discrimination between orthologs and paralogs is required.

[1]  J. Thompson, et al. Multiple sequence alignment with Clustal X. 1998, Trends in Biochemical Sciences.

[2]  W. Fitch. Distinguishing homologous from analogous proteins. 1970, Systematic Zoology.

[3]  J. A. Eisen, et al. Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis. 1998, Genome Research.

[4]  M. S. Boguski, et al. Human and nematode orthologs--lessons from the analysis of 1800 human genes and the proteome of Caenorhabditis elegans. 1999, Gene.

[5]  E. Abouheif, et al. Evolution and orthology of hedgehog genes. 1996, Trends in Genetics.

[6]  C. Ouzounis. Orthology: another terminology muddle. 1999, Trends in Genetics.

[7]  D. Lipman, et al. A genomic perspective on protein families. 1997, Science.

[8]  M. Boguski, et al. Evolutionary parameters of the transcribed mammalian genome: an analysis of 2,820 orthologous rodent and human sequences. 1998, Proceedings of the National Academy of Sciences of the United States of America.

[9]  J. Felsenstein. Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods. 1996, Methods in Enzymology.

[10]  Roderic D. M. Page, et al. GeneTree: comparing gene and species phylogenies using reconciled trees. 1998, Bioinform.

[11]  André Goffeau, et al. The yeast genome directory. 1997, Nature.

[12]  Michael Y. Galperin, et al. The COG database: a tool for genome-scale analysis of protein functions and evolution. 2000, Nucleic Acids Res.

[13]  Yan P. Yuan, et al. Predicting function: from genes to genomes and back. 1998, Journal of Molecular Biology.

[14]  Stephen M. Mount, et al. The genome sequence of Drosophila melanogaster. 2000, Science.

[15]  R. Page, et al. From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. 1997, Molecular Phylogenetics and Evolution.