Models, algorithms and programs for phylogeny reconciliation

Gene sequences contain a gold mine of phylogenetic information. Unfortunately for taxonomists, this information does not tell only the story of the species from which the sequences were collected. Genes have their own complex histories, which record speciation events but also many other events. Among these, gene duplications, transfers and losses are especially important to identify: they must be accounted for when reconstructing the history of species, and they play a fundamental role in the evolution of genomes, the diversification of organisms and the emergence of new cellular functions. We review reconciliation between gene trees and species trees, a family of rigorous approaches for identifying the duplications, transfers and losses that mark the evolution of a gene family. We survey existing reconciliation models and algorithms, discuss the difficulties of modeling gene transfers, and compare available reconciliation programs, noting their advantages and disadvantages.
