OrthoDB: the hierarchical catalog of eukaryotic orthologs

The concept of orthology is widely used to relate genes across species in comparative genomics, and it provides the basis for inferring gene function. Here we present OrthoDB, a web-accessible database that catalogs groups of orthologous genes hierarchically, at each radiation of the species phylogeny, from more general groups to finer-grained delineations among closely related species. We applied a COG-like and Inparanoid-like ortholog delineation procedure, based on all-against-all Smith-Waterman sequence comparisons, to 58 eukaryotic genomes, focusing on vertebrates, insects and fungi to facilitate further comparative studies. The database is freely available at http://cegg.unige.ch/orthodb.
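
COG-like and Inparanoid-like procedures generally start from best reciprocal hits between genes of different species, identified from all-against-all comparison scores. The following is a minimal sketch of that first step only, not the OrthoDB pipeline itself. The `species|gene` identifier format, the function names, and the toy scores are all assumptions made for illustration; in practice the scores would come from Smith-Waterman alignments.

```python
def best_hits(scores):
    """For each (query gene, target species), keep the highest-scoring hit.

    `scores` maps (query, subject) pairs to an alignment score.
    Identifiers are assumed to look like "species|gene" (hypothetical format).
    """
    best = {}
    for (query, subject), score in scores.items():
        q_species = query.split("|")[0]
        s_species = subject.split("|")[0]
        if q_species == s_species:
            continue  # within-species hits are not ortholog candidates
        key = (query, s_species)
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    return best


def reciprocal_best_hits(scores):
    """Return gene pairs that are each other's best hit across two species."""
    best = best_hits(scores)
    pairs = set()
    for (query, _target_species), (subject, _score) in best.items():
        q_species = query.split("|")[0]
        back = best.get((subject, q_species))
        if back is not None and back[0] == query:
            pairs.add(tuple(sorted((query, subject))))
    return pairs


# Toy all-against-all scores (invented values, for illustration only).
scores = {
    ("human|A1", "mouse|a1"): 950,
    ("human|A1", "mouse|a2"): 400,
    ("mouse|a1", "human|A1"): 940,
    ("mouse|a2", "human|A1"): 410,
    ("human|B1", "mouse|a2"): 300,
    ("mouse|a2", "human|B1"): 290,
}
rbh = reciprocal_best_hits(scores)
print(rbh)
```

In this toy input only `human|A1` and `mouse|a1` select each other as best hits. A full procedure would go on to cluster such pairs into groups and repeat the delineation at each node of the species phylogeny, which this sketch does not attempt.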
