The MinMax Squeeze: guaranteeing a minimal tree for population data.

We report that for population data, where sequences are very similar to one another, it is often possible to use a two-pronged (MinMax Squeeze) approach to prove that a tree is the shortest possible under the parsimony criterion. Such population data can lie in a range where parsimony is a maximum likelihood estimator. This is in sharp contrast to species data, where sequences are much further apart and guaranteeing an optimal phylogenetic tree is known to be computationally prohibitive for realistic numbers of species, whether likelihood or parsimony is the optimality criterion. The Squeeze uses both an upper bound (the length of the shortest tree known) and a lower bound derived from partitions of the columns (a length below which no tree can fall). If the two bounds meet, the shortest known tree is proven to be a shortest possible tree. The implementation is first tested on simulated data sets and then applied to 53 complete human mitochondrial genomes. The shortest possible trees for those data show several significant improvements over the published tree: a pair of Australian lineages comes deeper in the tree (in agreement with archaeological data), and the non-African part of the tree shows greater agreement with the geographical distribution of lineages.
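The squeeze logic can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the simplest possible column partition (each column on its own, which needs at least one change per extra character state) and it takes the upper bound from the Fitch parsimony length of one candidate tree. The function names, the nested-tuple tree encoding, and the toy sequences are all hypothetical.

```python
from typing import Dict, Tuple, Union

# Hypothetical encoding: a tree is a leaf name or a (left, right) pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_length(tree: Tree, seqs: Dict[str, str]) -> int:
    """Upper bound: parsimony length of one known tree (Fitch's algorithm)."""
    n_cols = len(next(iter(seqs.values())))
    total = 0
    for col in range(n_cols):

        def walk(node: Tree) -> set:
            nonlocal total
            if isinstance(node, str):
                return {seqs[node][col]}
            left, right = walk(node[0]), walk(node[1])
            common = left & right
            if common:
                return common
            total += 1  # disjoint state sets force one change here
            return left | right

        walk(tree)
    return total


def singleton_lower_bound(seqs: Dict[str, str]) -> int:
    """Lower bound from the trivial partition: one column per subset.

    A column with k distinct states needs at least k - 1 changes on any tree.
    Assumption: richer partitions, as in the text, can give tighter bounds.
    """
    return sum(len(set(column)) - 1 for column in zip(*seqs.values()))


def squeeze(tree: Tree, seqs: Dict[str, str]) -> Tuple[int, int, bool]:
    """Return (lower, upper, proven_minimal)."""
    lower = singleton_lower_bound(seqs)
    upper = fitch_length(tree, seqs)
    return lower, upper, lower == upper


# Toy data, invented for illustration only.
seqs = {"a": "AAA", "b": "AAC", "c": "GAC", "d": "GTC"}
print(squeeze((("a", "b"), ("c", "d")), seqs))  # bounds meet
print(squeeze((("a", "c"), ("b", "d")), seqs))  # bounds do not meet
```

When the bounds meet, the candidate tree is certified as a shortest tree. When they do not, either the tree is not minimal or the lower bound is too weak, and a better partition of the columns or a shorter tree is needed.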
