Homoplasy and clade support.

Distinguishing phylogenetic signal from homoplasy (shared similarities among taxa that do not arise by common ancestry) is an implicit goal of any phylogenetic study. Large amounts of homoplasy can interfere with accurate tree inference, and it is expected that common measures of clade support, including bootstrap proportions and Bayesian posterior probabilities, should also be impacted to some degree by homoplasy. Through data simulation and analysis of 38 empirical data sets, we show that high amounts of homoplasy will affect all measures of clade support in a manner that is dependent on clade size. More specifically, the smallest taxon bipartitions in an unrooted tree topology will receive higher support relative to clades of intermediate sizes, even when all clades are supported by the same amount of data. We determine that the ultimate causes of this effect are the inclusion of random trees (due to homoplasy) during bootstrap resampling and Markov chain Monte Carlo (MCMC) topology searching and the higher relative proportion of small taxon bipartitions (i.e., 2 or 3 taxa) to larger sized bipartitions. However, the use of explicit model-based methods, especially Bayesian MCMC methods, effectively overcomes this clade size effect even when very small amounts of phylogenetic signal are present. We develop a post hoc statistic, the clade disparity index (CDI), to measure both the relative magnitude of the clade size effect and its statistical significance. In analyses of both simulated and empirical data, CDI values indicate that Bayesian MCMC analyses are substantially more likely to estimate clade support values that are uncorrelated with clade size than are maximum parsimony and maximum likelihood bootstrap analyses and thus less affected by homoplasy. 
These results may be especially relevant to "deep" phylogenetic problems, such as reconstructing the tree of life, because such problems involve the largest possible extremes of time and evolutionary rates, two factors that cause homoplasy.
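The reasoning that random trees favor small bipartitions can be illustrated with a short calculation. The sketch below is not the simulation procedure used in the study; it assumes that a "random tree" is drawn uniformly from all unrooted, fully resolved topologies on n labeled taxa. Under that assumption, the fraction of topologies containing one specific bipartition with k taxa on one side is (2k-3)!! (2(n-k)-3)!! / (2n-5)!!. The function names are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def double_factorial(m: int) -> int:
    """Return m!! for odd m >= -1, using the convention (-1)!! = 1!! = 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def rooted_tree_count(k: int) -> int:
    """Number of rooted binary topologies on k labeled taxa: (2k - 3)!!."""
    return double_factorial(2 * k - 3)


def split_probability(n_taxa: int, k: int) -> Fraction:
    """Probability that one specific bipartition with k taxa on one side
    occurs in a topology drawn uniformly from all unrooted binary trees
    on n_taxa labeled taxa (an assumed model of a "random tree")."""
    if not 2 <= k <= n_taxa - 2:
        raise ValueError("k must describe a nontrivial bipartition")
    # Total number of unrooted binary topologies: (2n - 5)!!
    total = double_factorial(2 * n_taxa - 5)
    # Trees containing the split: a rooted subtree on each side of the edge.
    containing = rooted_tree_count(k) * rooted_tree_count(n_taxa - k)
    return Fraction(containing, total)


if __name__ == "__main__":
    n = 10
    for k in range(2, n // 2 + 1):
        p = split_probability(n, k)
        print(f"n={n}, bipartition size {k}: {p} (about {float(p):.4f})")
```

For 10 taxa, a particular 2-taxon bipartition occurs in 1/15 of all topologies, whereas a particular 5-taxon bipartition occurs in only 7/1287 of them. This is consistent with the expectation that randomly sampled trees lend more spurious support to the smallest bipartitions than to those of intermediate size.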
