iHam & pyHam: visualizing and processing hierarchical orthologous groups

Summary: The evolutionary history of gene families can be complex due to duplications and losses. This complexity is compounded by the large number of genomes simultaneously considered in contemporary comparative genomic analyses. As provided by several orthology databases, hierarchical orthologous groups (HOGs) are sets of genes that are inferred to have descended from a common ancestral gene within a species clade. This implies that the set of HOGs defined for a particular clade corresponds to the set of ancestral genes found in its last common ancestor. Furthermore, by keeping track of HOG composition along the species tree, it is possible to infer the emergence, duplications and losses of genes within a gene family of interest. However, the lack of tools to manipulate and analyse HOGs has made it difficult to extract, display, and interpret this type of information. To address this, we introduce iHam, an interactive JavaScript widget to visualise and explore gene family history encoded in HOGs, and pyHam, a Python library for programmatic processing of gene families. These complementary open-source tools greatly ease adoption of HOGs as a scalable and interpretable concept to relate genes across multiple species.

Availability and implementation: iHam’s code is available at https://github.com/DessimozLab/iHam or can be loaded dynamically. pyHam’s code is available at https://github.com/DessimozLab/pyHam or via the pip package “pyham”.

Contact: Christophe.Dessimoz@unil.ch
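The idea of tracking HOG composition along the species tree can be illustrated with a minimal, self-contained Python sketch. It does not use pyHam or its API; the species tree, the copy numbers, and the function name `classify_branches` are hypothetical and chosen only for illustration. The sketch also simplifies heavily: it compares the number of family members at each ancestral level, so it reports only the net change per branch, whereas real HOGs follow individual gene lineages and can reveal a duplication and a loss on the same branch.

```python
# Toy species tree (hypothetical): parent taxon -> child taxa.
SPECIES_TREE = {
    "Mammalia": ["Primates", "Mus musculus"],
    "Primates": ["Homo sapiens", "Pan troglodytes"],
}

# Hypothetical number of genes of one family at each taxonomic level,
# i.e. how many (ancestral) genes the family has in that taxon.
COPIES = {
    "Mammalia": 1,
    "Primates": 2,
    "Homo sapiens": 2,
    "Pan troglodytes": 1,
    "Mus musculus": 0,
}


def classify_branches(tree, copies, root):
    """Label each branch with the net copy-number change from parent to child."""
    events = {}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in tree.get(parent, []):
            diff = copies[child] - copies[parent]
            if diff > 0:
                events[(parent, child)] = f"{diff} duplication(s)"
            elif diff < 0:
                events[(parent, child)] = f"{-diff} loss(es)"
            else:
                events[(parent, child)] = "retained"
            stack.append(child)
    return events


events = classify_branches(SPECIES_TREE, COPIES, "Mammalia")
for (parent, child), label in sorted(events.items()):
    print(f"{parent} -> {child}: {label}")
```

In this toy family, the single ancestral mammalian gene is duplicated on the branch leading to primates, one copy is then lost in the chimpanzee lineage, and the gene is lost entirely in mouse.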
