Optimizing phylogenetic diversity under constraints.

Phylogenetic diversity (PD) is a measure of the extent to which different subsets of taxa span an evolutionary tree, and provides a quantitative tool for studying biodiversity conservation. Recently, it was shown that the problem of finding subsets of taxa of given size to maximize PD can be efficiently solved by a greedy algorithm. In this paper, we extend this earlier work, beginning with a more explicit description of the underlying combinatorial structure of the problem and its connection to greedoid theory. Next, we show that an extension of the PD optimization problem to a phylogeographic setting is NP-hard, although a special case has a polynomial-time solution based on the greedy algorithm. We also show how the greedy algorithm can be used to solve some special cases of the PD optimization problem when the admissible sets are restricted to those that are ecologically 'viable'. Finally, we show that three measures related to PD fail to be optimized by a greedy algorithm.
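
For readers who want a concrete picture of the unconstrained problem, the following is a minimal Python sketch of a greedy strategy for choosing k taxa. It is an illustration under stated assumptions, not the paper's own implementation: PD of a subset is taken to be the total edge length of the smallest subtree of an unrooted tree connecting that subset, the tree is stored as a nested dictionary of edge lengths, and the names `pd`, `greedy_pd`, and the four-taxon example tree are all hypothetical.

```python
from itertools import combinations


def pd(tree, subset):
    """PD of `subset`: total edge length of the minimal connecting subtree.

    Assumes `tree` maps each node to a dict {neighbour: edge_length} and
    describes an unrooted tree. Subsets with fewer than two taxa score 0.
    """
    subset = set(subset)
    if len(subset) < 2:
        return 0.0
    root = next(iter(subset))  # rooting at a chosen taxon simplifies the test below
    total = 0.0

    def visit(node, parent):
        # Returns True if the part of the tree below `node` contains a chosen taxon.
        nonlocal total
        found = node in subset
        for neighbour, length in tree[node].items():
            if neighbour == parent:
                continue
            if visit(neighbour, node):
                total += length  # this edge separates the root from a chosen taxon
                found = True
        return found

    visit(root, None)
    return total


def greedy_pd(tree, taxa, k):
    """Greedily pick k >= 2 taxa: best pair first, then the best single addition."""
    if k < 2 or k > len(taxa):
        raise ValueError("k must satisfy 2 <= k <= number of taxa")
    chosen = set(max(combinations(sorted(taxa), 2), key=lambda pair: pd(tree, pair)))
    while len(chosen) < k:
        candidates = sorted(set(taxa) - chosen)
        chosen.add(max(candidates, key=lambda t: pd(tree, chosen | {t})))
    return chosen


# Hypothetical quartet ((a,b),(c,d)) with internal nodes u and v.
example_tree = {
    "a": {"u": 1.0},
    "b": {"u": 2.0},
    "c": {"v": 4.0},
    "d": {"v": 2.0},
    "u": {"a": 1.0, "b": 2.0, "v": 3.0},
    "v": {"c": 4.0, "d": 2.0, "u": 3.0},
}
example_taxa = ["a", "b", "c", "d"]
best_three = greedy_pd(example_tree, example_taxa, 3)
```

On this small example the greedy choice can be checked against brute-force enumeration of all three-taxon subsets. The sketch recomputes PD from scratch for every candidate, which is simple but not efficient; it is meant only to make the selection rule explicit.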
