Parsimony, likelihood, and simplicity

The latest charge against parsimony in phylogenetic inference is that it involves estimating too many parameters. The charge derives from the fact that, when each character is allowed a branch length vector of its own (instead of the homogeneous branch lengths assumed in current likelihood models), likelihood and parsimony give identical results. Parsimony, however, can also be derived from simpler models involving fewer parameters. Parsimony therefore provides (as many authors had argued before) the simplest explanation of the data, or the most realistic, depending on one's views. If (as argued by likelihoodists) phylogenetic inference is to use the simplest model that sufficiently explains the data, the starting point of phylogenetic analyses should be parsimony, not maximum likelihood. If adding new parameters (which increase the likelihood) to a parsimony estimation is seen as desirable, this may lead to a preference for results based on current likelihood models. If parameters continue to be added, however, the results will eventually return to where they started, since allowing each character a branch length vector of its own also produces parsimony. Parsimony can thus be justified by very different types of models, either very complex or very simple. This suggests that parsimony does have a unique place among methods of phylogenetic estimation.
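The claim that per-character branch lengths make likelihood and parsimony agree can be illustrated with a minimal sketch. It is not taken from the paper: the toy trees, the data matrix, and the function names are hypothetical. It also assumes that, under a symmetric r-state model with a separate branch length vector for each character, the maximized likelihood of a character requiring l steps is r^-(l+1). Under that assumption the log-likelihood of a tree is a decreasing function of its parsimony length, so both criteria rank trees the same way.

```python
from math import log

def fitch_steps(tree, states):
    """Minimum number of state changes for one character (Fitch's algorithm).

    tree   -- nested 2-tuples of leaf names, e.g. (("A", "B"), ("C", "D"))
    states -- dict mapping each leaf name to its observed state
    """
    def visit(node):
        if isinstance(node, str):               # leaf: its state set, no steps
            return {states[node]}, 0
        left_set, left_steps = visit(node[0])
        right_set, right_steps = visit(node[1])
        common = left_set & right_set
        if common:                              # children agree: no extra step
            return common, left_steps + right_steps
        return left_set | right_set, left_steps + right_steps + 1
    return visit(tree)[1]

def tree_length(tree, matrix):
    """Parsimony length: total steps summed over all characters."""
    return sum(fitch_steps(tree, char) for char in matrix)

def per_character_log_likelihood(tree, matrix, r):
    """Log-likelihood with a separate branch length vector per character.

    ASSUMPTION (not derived here): each character's maximized likelihood
    is r ** -(steps + 1) under a symmetric r-state model.
    """
    return sum(-(fitch_steps(tree, char) + 1) * log(r) for char in matrix)

# Hypothetical four-taxon example with three binary characters.
T1 = (("A", "B"), ("C", "D"))
T2 = (("A", "C"), ("B", "D"))
matrix = [
    {"A": 0, "B": 0, "C": 1, "D": 1},
    {"A": 0, "B": 1, "C": 0, "D": 1},
    {"A": 0, "B": 0, "C": 1, "D": 1},
]

for name, tree in (("T1", T1), ("T2", T2)):
    print(name, tree_length(tree, matrix),
          per_character_log_likelihood(tree, matrix, r=2))
```

In this toy case the shorter tree also has the higher log-likelihood, which is the sense in which the two criteria coincide under the assumed model; the sketch says nothing about the homogeneous branch length models used in practice.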
