New Perspectives on Gene Family Evolution: Losses in Reconciliation and a Link with Supertrees

Reconciling a set of gene trees with a given species tree is the most commonly used approach to inferring the duplication and loss events in the evolution of gene families. When the species tree is not known, a natural algorithmic problem is to infer a species tree such that the corresponding reconciliation minimizes the number of duplications and/or losses. In this paper, we clarify several theoretical questions and study various algorithmic issues related to these two problems. (1) For a given gene tree T and species tree S, we show that there is a single history explaining T and consistent with S that minimizes gene losses, and that this history also minimizes the number of duplications. We describe a simple algorithm, linear in both time and space, that computes this parsimonious history and is not based on the Lowest Common Ancestor (LCA) mapping approach. (2) We show that the problem of computing a species tree that minimizes the number of gene duplications, given a set of gene trees, is in fact a slight variant of a supertree problem. (3) We show that deciding whether a set of gene trees can be explained using only apparent duplications can be done efficiently, as can computing a parsimonious species tree for such gene trees. We also characterize the gene trees that can be explained using only apparent duplications in terms of compatible triplets of leaves.
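For readers unfamiliar with the classical LCA mapping that the abstract contrasts its algorithm with, the following is a minimal sketch of that standard approach for counting duplications. It is not the algorithm of this paper. The tree encoding (nested tuples whose leaves are species names) and the function names `clusters`, `lca_map`, and `count_duplications` are assumptions made for illustration, and the sketch favors clarity over the linear running time mentioned above.

```python
def clusters(tree):
    """Return the leaf set (cluster) of every node of a nested-tuple tree."""
    out = []

    def visit(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(visit(child) for child in node))
        out.append(leaves)
        return leaves

    visit(tree)
    return out


def lca_map(species_set, species_clusters):
    """Smallest species-tree cluster containing species_set (its LCA)."""
    return min((c for c in species_clusters if species_set <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes mapped to the same species node as a child."""
    species_clusters = clusters(species_tree)
    dups = 0

    def visit(node):
        nonlocal dups
        if isinstance(node, str):
            # A gene leaf is labeled by the species it belongs to.
            return lca_map(frozenset([node]), species_clusters)
        child_maps = [visit(child) for child in node]
        mapped = lca_map(frozenset().union(*child_maps), species_clusters)
        if mapped in child_maps:
            # Classical criterion: parent and child share the same image.
            dups += 1
        return mapped

    visit(gene_tree)
    return dups


species = (("a", "b"), "c")
print(count_duplications((("a", "b"), "c"), species))  # congruent gene tree
print(count_duplications((("a", "c"), "b"), species))  # incongruent gene tree
```

Under this criterion, a gene tree congruent with the species tree needs no duplication, while the incongruent tree above requires one at its root.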
