Complexity Insights of the Minimum Duplication Problem

The Minimum Duplication problem is a well-known problem in phylogenetics and comparative genomics. Given a set of gene trees, it asks for a species tree that induces the minimum number of gene duplications in the input gene trees. More recently, a variant called Minimum Duplication Bipartite has been introduced in [14], where the goal is to find all pre-duplications, that is, duplications that precede the first speciation with respect to a species tree. In this paper, we investigate the complexity of both the Minimum Duplication and the Minimum Duplication Bipartite problems. First, we prove that the Minimum Duplication problem is APX-hard, even when the input consists of five uniquely leaf-labelled gene trees, which advances what is known about the complexity of the problem. Then, we show that the Minimum Duplication Bipartite problem can be solved efficiently by a randomized algorithm when the input gene trees have bounded depth.
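To make the objective concrete, the following is a minimal sketch of how the number of duplications induced by a fixed candidate species tree can be counted. It assumes the standard LCA-mapping (reconciliation) definition, under which a gene-tree node is a duplication when it maps to the same species-tree node as one of its children. The abstract does not spell out this definition, and the nested-tuple tree encoding and the function names `clusters`, `lca_map`, and `count_duplications` are illustrative choices, not taken from the paper.

```python
def clusters(tree):
    """Return the leaf set of every node of a nested-tuple tree, root last."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    result = []
    leaves = frozenset()
    for child in tree:
        sub = clusters(child)
        result.extend(sub)
        leaves |= sub[-1]  # the last entry is the child's own leaf set
    result.append(leaves)
    return result


def lca_map(species, species_clusters):
    """Smallest species-tree cluster containing the given set of species."""
    return min((c for c in species_clusters if species <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes mapped to the same species node as a child."""
    sc = clusters(species_tree)
    dups = 0

    def visit(node):
        nonlocal dups
        if isinstance(node, str):
            # A gene-tree leaf is labelled by the species it was sampled from.
            return lca_map(frozenset([node]), sc)
        child_maps = [visit(child) for child in node]
        here = lca_map(frozenset().union(*child_maps), sc)
        if here in child_maps:
            dups += 1  # assumed duplication criterion (LCA mapping)
        return here

    visit(gene_tree)
    return dups


species_tree = (("a", "b"), "c")
gene_trees = [(("a", "b"), "c"), (("a", "c"), "b"), (("a", "a"), "b")]
total = sum(count_duplications(g, species_tree) for g in gene_trees)
```

Under these assumptions, the Minimum Duplication problem amounts to choosing the species tree that minimizes `total` over all candidate species trees. The sketch only evaluates one candidate and does not attempt that search.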

[1]  Éva Tardos,et al.  Algorithm design, 2005.

[2]  Roderic D. M. Page,et al.  GeneTree: comparing gene and species phylogenies using reconciled trees , 1998, Bioinform.

[3]  Viggo Kann,et al.  Some APX-completeness results for cubic graphs , 2000, Theor. Comput. Sci.

[4]  E. Eichler,et al.  Structural Dynamics of Eukaryotic Chromosome Evolution , 2003, Science.

[5]  Oliver Eulenstein,et al.  Heuristics for the Gene-Duplication Problem: A Θ(n) Speed-Up for the Local Search , 2007, RECOMB.

[6]  Manolis Kellis,et al.  Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss , 2012, Bioinform.

[7]  Ron Shamir,et al.  A Note on the Fixed Parameter Tractability of the Gene-Duplication Problem , 2011, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[8]  R. Page,et al.  Rates and patterns of gene duplication and loss in the human genome , 2005, Proceedings of the Royal Society B: Biological Sciences.

[9]  Vincent Berry,et al.  An Efficient Algorithm for Gene/Species Trees Parsimonious Reconciliation with Losses, Duplications and Transfers , 2010, RECOMB-CG.

[10]  Oliver Eulenstein,et al.  The Gene-Duplication Problem: Near-Linear Time Algorithms for NNI-Based Local Searches , 2009, TCBB.

[11]  Ulrike Stege,et al.  Gene Trees and Species Trees: The Gene-Duplication Problem is Fixed-Parameter Tractable , 1999, WADS.

[12]  Jaroslaw Byrka,et al.  New Results on Optimizing Rooted Triplets Consistency , 2008, ISAAC.

[13]  Oliver Eulenstein,et al.  Reconciling Gene Trees with Apparent Polytomies , 2006, COCOON.

[14]  R Clay Reid,et al.  Materials and Methods Som Text Figs. S1 to S7 References Movies S1 to S7 Role of Subplate Neurons in Functional Maturation of Visual Cortical Columns, 2022.

[15]  David Fernández-Baca,et al.  An ILP solution for the gene duplication problem , 2011, BMC Bioinformatics.

[16]  Nadia El-Mabrouk,et al.  New Perspectives on Gene Family Evolution: Losses in Reconciliation and a Link with Supertrees , 2009, RECOMB.

[17]  Michael T. Hallett,et al.  New algorithms for the duplication-loss model , 2000, RECOMB '00.

[18]  Bengt Sennblad,et al.  Gene tree reconstruction and orthology analysis based on an integrated model for duplications and sequence evolution , 2004, RECOMB.

[19]  Michael T. Hallett,et al.  Simultaneous identification of duplications and lateral transfers , 2004, RECOMB.

[20]  Krister M. Swenson,et al.  An Approximation Algorithm for Computing a Parsimonious First Speciation in the Gene Duplication Model , 2010, RECOMB-CG.

[21]  Bengt Sennblad,et al.  The gene evolution model and computing its associated probabilities , 2009, JACM.

[22]  J. Felsenstein Phylogenies from molecular sequences: inference and reliability. , 1988, Annual review of genetics.

[23]  W. Fitch Homology: a personal view on some of the problems. , 2000, Trends in genetics : TIG.

[24]  Michael T. Hallett,et al.  Simultaneous Identification of Duplications and Lateral Gene Transfers , 2011, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[25]  Niklaus Wirth,et al.  Algorithms and Data Structures , 1989, Lecture Notes in Computer Science.

[26]  Riccardo Dondi,et al.  Resolving Rooted Triplet Inconsistency by Dissolving Multigraphs , 2013, TAMC.

[27]  Dannie Durand,et al.  A hybrid micro-macroevolutionary approach to gene tree reconstruction, 2006.

[28]  Manolis Kellis,et al.  Reconciliation Revisited: Handling Multiple Optima when Reconciling with Duplication, Transfer, and Loss , 2013, J. Comput. Biol.

[29]  Roderic D. M. Page,et al.  Vertebrate Phylogenomics: Reconciled Trees and Gene Duplications , 2001, Pacific Symposium on Biocomputing.

[30]  D. J. A. Welsh,et al.  An upper bound for the chromatic number of a graph and its application to timetabling problems , 1967, Comput. J.

[31]  Bin Ma,et al.  From Gene Trees to Species Trees , 2000, SIAM J. Comput.

[32]  Paola Bonizzoni,et al.  Reconciling a gene tree to a species tree under the duplication cost model , 2005, Theor. Comput. Sci.