Discordance of species trees with their most likely gene trees: the case of five taxa.

Under a coalescent model for within-species evolution, gene trees may differ from species trees to such an extent that the gene tree topology most likely to evolve along the branches of a species tree can disagree with the species tree topology. Gene tree topologies that are more likely to be produced than the topology that matches that of the species tree are termed anomalous, and the region of branch-length space that gives rise to anomalous gene trees (AGTs) is the anomaly zone. We examine the occurrence of anomalous gene trees for the case of five taxa, the smallest number of taxa for which every species tree topology has a nonempty anomaly zone. Considering all sets of branch lengths that give rise to anomalous gene trees, the largest value possible for the smallest branch length in the species tree is greater in the five-taxon case (0.1934 coalescent time units) than in the previously studied case of four taxa (0.1568). The five-taxon case demonstrates the existence of three phenomena that do not occur in the four-taxon case. First, anomalous gene trees can have the same unlabeled topology as the species tree. Second, the anomaly zone does not necessarily enclose a ball centered at the origin in branch-length space, in which all branches are short. Third, as a branch length increases, it is possible for the number of AGTs to increase rather than decrease or remain constant. These results, which help to describe how the properties of anomalous gene trees increase in complexity as the number of taxa increases, will be useful in formulating strategies for evading the problem of anomalous gene trees during species tree inference from multilocus data.
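To give a sense of the size of the search space behind these statements, the following minimal sketch counts rooted binary tree topologies for small numbers of taxa. It is not taken from the paper: it uses the standard combinatorial results that the number of labeled rooted binary topologies on n taxa is (2n-3)!!, and that the number of unlabeled topologies follows the Wedderburn-Etherington recursion. The function names are hypothetical and chosen only for illustration.

```python
from functools import lru_cache


def num_labeled_topologies(n: int) -> int:
    """Count rooted binary labeled topologies on n taxa: (2n-3)!!."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-3 (empty product for n <= 2).
    for k in range(3, 2 * n - 2, 2):
        count *= k
    return count


@lru_cache(maxsize=None)
def num_unlabeled_topologies(n: int) -> int:
    """Count rooted binary unlabeled topologies (tree shapes) with n leaves."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return 1
    total = 0
    # Root splits the leaves into subtrees of unequal sizes left < n - left.
    for left in range(1, (n - 1) // 2 + 1):
        total += num_unlabeled_topologies(left) * num_unlabeled_topologies(n - left)
    # For even n, both subtrees can have n/2 leaves; count unordered pairs.
    if n % 2 == 0:
        half = num_unlabeled_topologies(n // 2)
        total += half * (half + 1) // 2
    return total


if __name__ == "__main__":
    for taxa in (3, 4, 5):
        print(taxa, num_labeled_topologies(taxa), num_unlabeled_topologies(taxa))
```

Under these counts, five taxa give 105 labeled topologies but only 3 unlabeled ones, compared with 15 and 2 for four taxa. For any species tree, 104 gene tree topologies are therefore potential AGTs in the five-taxon case, and some of them necessarily share the species tree's unlabeled topology.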
