The probability of monophyly of a sample of gene lineages on a species tree

Monophyletic groups, which consist of all of the descendants of a most recent common ancestor, arise naturally as a consequence of descent processes that result in meaningful distinctions between organisms. Aspects of monophyly are therefore central to fields that examine and use genealogical descent. In particular, studies in conservation genetics, phylogeography, population genetics, species delimitation, and systematics can all make use of mathematical predictions under evolutionary models about features of monophyly. One important calculation, the probability that a set of gene lineages is monophyletic under a two-species neutral coalescent model, has been used in many studies. Here, we extend this calculation to a species tree model that contains arbitrarily many species. We study the effects of species tree topology and branch lengths on the monophyly probability. These analyses reveal new behavior, including the maintenance of nontrivial monophyly probabilities for gene lineage samples that span multiple species and even for lineages that do not derive from a monophyletic species group. We illustrate the mathematical results using an example application to data from maize and teosinte.
