Evaluating the Use of ABBA–BABA Statistics to Locate Introgressed Loci

Several methods have been proposed to test for introgression across genomes. One method tests for a genome-wide excess of shared derived alleles between taxa using Patterson’s D statistic, but does not establish which loci show such an excess or whether the excess is due to introgression or ancestral population structure. Several recent studies have extended the use of D by applying the statistic to small genomic regions, rather than genome-wide. Here, we use simulations and whole-genome data from Heliconius butterflies to investigate the behavior of D in small genomic regions. We find that D is unreliable in this situation as it gives inflated values when effective population size is low, causing D outliers to cluster in genomic regions of reduced diversity. As an alternative, we propose a related statistic f̂_d, a modified version of a statistic originally developed to estimate the genome-wide fraction of admixture. f̂_d is not subject to the same biases as D, and is better at identifying introgressed loci. Finally, we show that both D and f̂_d outliers tend to cluster in regions of low absolute divergence (d_XY), which can confound a recently proposed test for differentiating introgression from shared ancestral variation at individual loci.
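
The abstract does not spell out either statistic, so the following is only a minimal sketch of how D and f̂_d might be computed for one genomic window. It assumes the commonly used frequency-based formulation for taxa P1, P2, P3 and an outgroup O: per-site ABBA and BABA weights are products of derived-allele frequencies, D is their normalized difference, and the f̂_d denominator replaces P2 and P3 at each site with whichever has the higher derived-allele frequency. The function names, the input layout (one `(p1, p2, p3)` tuple per site), and the default of an outgroup fixed for the ancestral allele are illustrative assumptions, not taken from the text.

```python
import math

def abba_baba(p1, p2, p3, po=0.0):
    """Per-site ABBA and BABA weights from derived-allele frequencies.

    po is the outgroup frequency; 0.0 assumes the outgroup is fixed
    for the ancestral allele (an assumption of this sketch).
    """
    abba = (1 - p1) * p2 * p3 * (1 - po)
    baba = p1 * (1 - p2) * p3 * (1 - po)
    return abba, baba

def patterson_d(sites):
    """D = sum(ABBA - BABA) / sum(ABBA + BABA) over the sites of a window."""
    num = den = 0.0
    for p1, p2, p3 in sites:
        abba, baba = abba_baba(p1, p2, p3)
        num += abba - baba
        den += abba + baba
    return num / den if den > 0 else math.nan

def f_d(sites):
    """f_d = S(P1, P2, P3, O) / S(P1, PD, PD, O), with S = sum(ABBA - BABA).

    PD is, at each site, whichever of P2 and P3 has the higher
    derived-allele frequency.
    """
    num = den = 0.0
    for p1, p2, p3 in sites:
        abba, baba = abba_baba(p1, p2, p3)
        num += abba - baba
        pd = max(p2, p3)
        abba_d, baba_d = abba_baba(p1, pd, pd)
        den += abba_d - baba_d
    return num / den if den > 0 else math.nan

# Hypothetical one-site window: derived allele absent in P1,
# at frequency 0.5 in P2, and fixed in P3.
window = [(0.0, 0.5, 1.0)]
print(patterson_d(window), f_d(window))
```

In this toy window D reaches its maximum of 1.0 while f̂_d is 0.5, which illustrates, under the stated assumptions, how D can saturate when few sites are informative whereas f̂_d scales with the share of the allele that appears to be shared.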