GENE TREE DISTRIBUTIONS UNDER THE COALESCENT PROCESS

Abstract: Under the coalescent model for population divergence, lineage sorting can cause considerable variability in gene trees generated from any given species tree. In this paper, we derive a method for computing the distribution of gene tree topologies given a bifurcating species tree with an arbitrary number of taxa, for the case in which one gene is sampled per species. Applications of gene tree distributions include determining exact probabilities of topological equivalence between gene trees and species trees and inferring species trees from multiple datasets. In addition, we examine the shapes of gene tree distributions and their sensitivity to changes in branch lengths, species tree shape, and tree size. The method for computing gene tree distributions is implemented in the computer program COAL.
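
As a small illustration of what a gene tree distribution looks like, the three-taxon case has a well-known closed form: for a species tree ((A,B),C) whose internal branch has length t in coalescent units, the matching gene tree topology has probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). The sketch below only evaluates that formula. It is not the general method of the paper and is not taken from COAL; the function name and the taxon labels A, B, C are assumptions made for this example.

```python
import math


def three_taxon_gene_tree_distribution(t):
    """Gene tree topology probabilities for the species tree ((A,B),C).

    t is the internal branch length in coalescent units (assumed >= 0),
    with one gene sampled per species. Hypothetical helper for illustration.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Probability that the A and B lineages do NOT coalesce in the internal branch.
    p_no_coal = math.exp(-t)
    return {
        # Coalescence in the internal branch, or the matching pair (1 of 3)
        # coalescing first in the ancestral population.
        "((A,B),C)": 1.0 - (2.0 / 3.0) * p_no_coal,
        "((A,C),B)": p_no_coal / 3.0,
        "((B,C),A)": p_no_coal / 3.0,
    }


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        dist = three_taxon_gene_tree_distribution(t)
        print(t, {topology: round(p, 4) for topology, p in dist.items()})
```

With t = 0 the three topologies are equally likely, and as t grows the distribution concentrates on the topology that matches the species tree, which is the branch-length sensitivity the abstract refers to.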
