Gene Tree Correction by Leaf Removal and Modification: Tractability and Approximability

The reconciliation of a gene tree and a species tree is a well-known method for understanding the evolution of a gene family: it identifies which evolutionary events (speciations, duplications and losses) occurred during gene evolution. Since reconciliation is usually affected by errors in the gene trees, the gene trees have to be preprocessed before reconciliation. One method to tackle this problem corrects a gene tree by removing the minimum number of leaves (Minimum Leaf Removal). In this paper we show that Minimum Leaf Removal is not approximable within a factor of b log m, where m is the number of leaves of the species tree and b > 0 is a constant. Furthermore, we introduce a new variant of the problem, where the goal is to correct a gene tree with the minimum number of leaf modifications. We show that this problem, unlike the removal version, is W[1]-hard when parameterized by the number of leaf modifications.
