Detection and Polarization of Introgression in a Five-taxon Phylogeny

When multiple speciation events occur rapidly in succession, discordant genealogies due to incomplete lineage sorting (ILS) can complicate the detection of introgression. A variety of methods, including the D-statistic (a.k.a. the "ABBA-BABA test"), have been proposed to infer introgression in the presence of ILS for a four-taxon clade. However, no integrated method exists to detect introgression using allelic patterns for more complex phylogenies. Here we explore the issues associated with previous systems of applying D-statistics to a larger tree topology, and propose new D_FOIL tests as an integrated framework to infer both the taxa involved in and the direction of introgression for a symmetric five-taxon phylogeny. Using theory and simulations, we show that the D_FOIL statistics correctly identify the introgression donor and recipient lineages, even at low rates of introgression. D_FOIL is also shown to have extremely low false-positive rates. The D_FOIL tests are computationally inexpensive to calculate and can easily be applied to phylogenomic data sets, both genome-wide and in windows of the genome. In addition, we explore both the principles and problems of introgression detection in even more complex phylogenies.
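
For orientation, the classic four-taxon ABBA-BABA test mentioned above can be sketched in a few lines. This is a minimal illustration of the standard four-taxon statistic only, not of the five-taxon tests proposed here. It assumes an ordered tree (((P1, P2), P3), O), one aligned haploid sequence per taxon, and the outgroup allele taken as ancestral; the function names `site_patterns` and `d_statistic` are hypothetical and chosen for this example.

```python
from collections import Counter


def site_patterns(p1, p2, p3, outgroup):
    """Count A/B site patterns over four aligned sequences (P1, P2, P3, O).

    Assumption for this sketch: the outgroup allele is ancestral ("A") and
    any other allele is derived ("B"). Sites that are not biallelic, or that
    contain anything other than A, C, G, T, are skipped.
    """
    counts = Counter()
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        alleles = {a, b, c, o}
        if len(alleles) != 2 or not alleles <= set("ACGT"):
            continue
        pattern = "".join("A" if x == o else "B" for x in (a, b, c, o))
        counts[pattern] += 1
    return counts


def d_statistic(counts):
    """Four-taxon D = (nABBA - nBABA) / (nABBA + nBABA).

    Returns None when no ABBA or BABA sites were observed.
    """
    abba, baba = counts["ABBA"], counts["BABA"]
    total = abba + baba
    if total == 0:
        return None
    return (abba - baba) / total


if __name__ == "__main__":
    # Toy alignment: three ABBA sites, one BABA site, one invariant site.
    p1, p2, p3, out = "AAACG", "TTTGG", "TTTCG", "AAAGG"
    counts = site_patterns(p1, p2, p3, out)
    print(counts["ABBA"], counts["BABA"], d_statistic(counts))
```

Under ILS alone, ABBA and BABA sites are expected in equal numbers, so D is expected to be near zero; a significant excess of one pattern suggests introgression between P3 and one of P1 or P2. Assessing significance (for example by block resampling along the genome) is outside the scope of this sketch.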
