Parsimony overcomes statistical inconsistency with the addition of more data from the same gene

Many authors have demonstrated that the parsimony method of phylogenetic analysis can fail to estimate phylogeny accurately under certain conditions when the data follow a model that stipulates homogeneity of the evolutionary process. These demonstrations further show that, no matter how many data are added, parsimony will remain statistically inconsistent if the additional data have the same distributional properties as the original data. This final condition, that the additional data must follow the same distribution as the original data, is crucial to the demonstration. Recent simulations show, however, that parsimony can perform consistently if data evolve heterogeneously. Here we show, using natural data, that parsimony can overcome inconsistency when new data from the same gene are added to an analysis already exhibiting a condition indistinguishable from inconsistency.
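The role of the "same distribution" condition can be illustrated with a minimal sketch. It is not the analysis reported here. It assumes a four-taxon true tree ((A,B),(C,D)) and a two-state symmetric substitution model, and every branch change probability below is a hypothetical value chosen only for illustration. For four taxa, parsimony in the limit of infinite data prefers the topology whose supporting informative site pattern is most frequent, so comparing exact pattern probabilities is enough to see which tree it converges on.

```python
from itertools import product

TREES = ("AB|CD", "AC|BD", "AD|BC")


def pattern_probs(p_a, p_b, p_c, p_d, p_int):
    """Exact probabilities of the three parsimony-informative site patterns
    on the true tree ((A,B),(C,D)) under a two-state symmetric model.

    Each argument is the probability that the state changes along that
    branch (illustrative assumption, not values from the paper).
    """
    def edge(p, changed):
        return p if changed else 1.0 - p

    support = dict.fromkeys(TREES, 0.0)
    # x and y are the states at the two internal nodes; a..d are leaf states.
    for x, y, a, b, c, d in product((0, 1), repeat=6):
        prob = (0.5 * edge(p_int, x != y)
                * edge(p_a, a != x) * edge(p_b, b != x)
                * edge(p_c, c != y) * edge(p_d, d != y))
        if a == b and c == d and a != c:
            support["AB|CD"] += prob
        elif a == c and b == d and a != b:
            support["AC|BD"] += prob
        elif a == d and b == c and a != b:
            support["AD|BC"] += prob
    return support


def mix(classes):
    """Pool site classes given as (relative number of sites, pattern probs)."""
    total = sum(weight for weight, _ in classes)
    return {t: sum(weight * probs[t] for weight, probs in classes) / total
            for t in TREES}


def parsimony_limit(support):
    """Topology parsimony converges on as the number of sites grows."""
    return max(support, key=support.get)


# Original data: long branches to A and C, short branches elsewhere.
original = pattern_probs(0.40, 0.05, 0.40, 0.05, 0.05)
# Hypothetical added data with different distributional properties:
# short terminal branches and a longer internal branch.
added = pattern_probs(0.05, 0.05, 0.05, 0.05, 0.30)

print(parsimony_limit(original))
print(parsimony_limit(mix([(1, original), (1, added)])))
```

Under these assumed values, adding more sites drawn from the original distribution leaves the pattern frequencies unchanged, so the preferred topology cannot change. Pooling an equal number of sites from the second class shifts the pattern frequencies, and with them the topology that parsimony prefers.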