New algorithms for the duplication-loss model

We consider the problem of constructing a species tree from a set of gene trees. In the frameworks introduced by Goodman et al. [3], Page [10], and Guigó, Muchnik, and Smith [5], this is formulated as an optimization problem: find the species tree that requires the minimum number of duplications and/or losses to explain the gene trees. In this paper, we introduce the WIDTH k DUPLICATION-LOSS and WIDTH k DUPLICATION problems. A gene tree has width k with respect to a species tree if the species tree can be reconciled with the gene tree using at most k simultaneously active copies of the gene along its branches. We explain, with respect to the underlying biological model, why this width is typically very small in comparison to the total number of duplications and losses. We give polynomial-time algorithms for finding optimal species trees having bounded width with respect to at least one of the input gene trees. Furthermore, we present the first algorithm for input gene trees that are unrooted. Lastly, we apply our algorithms to a dataset from [5] and exhibit a species tree requiring significantly fewer duplications and fewer duplications/losses than the trees given in the original paper.
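To make the cost being minimized concrete, the following is a minimal sketch of how duplications can be counted for one rooted gene tree against one fixed species tree, using the standard least-common-ancestor mapping from the reconciled-tree literature. It is not the bounded-width algorithm of this paper. The tree encoding (nested tuples with species names as leaves) and the function names `leaf_set`, `species_clusters`, `lca_map`, and `count_duplications` are assumptions chosen for illustration only.

```python
def leaf_set(tree):
    """Set of species names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaf_set(left) | leaf_set(right)


def species_clusters(species_tree):
    """Leaf sets of all nodes of the species tree, one per node."""
    clusters = []

    def walk(node):
        clusters.append(leaf_set(node))
        if not isinstance(node, str):
            walk(node[0])
            walk(node[1])

    walk(species_tree)
    return clusters


def lca_map(gene_node, clusters):
    """Smallest species-tree cluster containing every species under gene_node.

    The clusters containing a given set are nested, so the smallest is unique.
    """
    needed = leaf_set(gene_node)
    return min((c for c in clusters if needed <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that map to the same species node as a child."""
    clusters = species_clusters(species_tree)

    def walk(g):
        if isinstance(g, str):
            return 0
        here = lca_map(g, clusters)
        is_dup = any(lca_map(child, clusters) == here for child in g)
        return int(is_dup) + walk(g[0]) + walk(g[1])

    return walk(gene_tree)


# Illustrative (made-up) trees: leaves are species names, so a species
# appearing twice in a gene tree stands for two copies of the gene.
S = (("A", "B"), "C")
print(count_duplications((("A", "B"), "C"), S))  # gene tree agrees with S
print(count_duplications((("A", "C"), "B"), S))  # gene tree conflicts with S
```

The optimization problem described above would then search over candidate species trees for the one minimizing this count summed over all input gene trees; that search, and the accounting of losses, are not shown here.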

[1] D. Lipman, et al. A genomic perspective on protein families. 1997, Science.

[2] Yan P. Yuan, et al. Predicting function: from genes to genomes and back. 1998, Journal of molecular biology.

[3] Roderic D. M. Page, et al. GeneTree: comparing gene and species phylogenies using reconciled trees. 1998, Bioinform.

[4] Ilya B. Muchnik, et al. A Biologically Consistent Model for Comparing Molecular Phylogenies. 1995, J. Comput. Biol.

[5] Michael R. Fellows, et al. Analogs and Duals of the MAST Problem for Sequences and Trees. 1998, ESA.

[6] Martin Vingron, et al. Towards detection of orthologues in sequence databases. 1998, Bioinform.

[7] Michael Y. Galperin, et al. Beyond complete genomes: from sequence to structure and function. 1998, Current opinion in structural biology.

[8] Temple F. Smith, et al. Reconstruction of ancient molecular phylogeny. 1996, Molecular phylogenetics and evolution.

[9] R. Page, et al. From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. 1997, Molecular phylogenetics and evolution.

[10] G. Moore, et al. Fitting the gene lineage into its species lineage. 1979.

[11] Bin Ma, et al. On reconstructing species trees from gene trees in term of duplications and losses. 1998, RECOMB '98.

[12] Gene trees and species trees the gene duplication problem is fixed-parameter.

[13] R. Page. Maps between trees and cladistic analysis of historical associations among genes. 1994.

[14] Michael A. Charleston, et al. Reconciled trees and incongruent gene and species trees. 1996, Mathematical Hierarchies and Biology.