Complex genetic admixture histories reconstructed with Approximate Bayesian Computation

Admixture is a fundamental evolutionary process that has influenced genetic patterns in numerous species. Maximum-likelihood approaches based on allele frequencies and linkage disequilibrium have been used extensively to infer admixture processes from dense genome-wide datasets, mostly in human populations. Nevertheless, complex admixture histories, beyond one or two pulses of admixture, remain methodologically challenging to reconstruct, especially when large datasets are unavailable. We develop an Approximate Bayesian Computation (ABC) framework to reconstruct complex admixture histories from independent genetic markers. We build the software package MetHis to simulate independent SNPs in a two-way admixed population for scenarios with multiple admixture pulses, or with monotonically decreasing or increasing admixture at each generation, drawing model-parameter values from prior distributions set by the user. For each simulated dataset, we calculate 24 summary statistics describing genetic diversity and moments of the distribution of individual admixture fractions. We couple MetHis with existing ABC algorithms and investigate the admixture history of an African American and a Barbadian population. Results show that Random-Forest ABC scenario choice, followed by Neural-Network ABC posterior parameter estimation, can distinguish most complex admixture scenarios and provide accurate model-parameter estimates. For both admixed populations, we find that monotonically decreasing contributions over time, from the European and African sources, explain the observed data more accurately than multiple admixture pulses. Furthermore, we find contrasting trajectories of introgression decay from the European and African sources between the two admixed populations. This approach will allow for reconstructing detailed admixture histories in numerous populations and species, particularly when maximum-likelihood methods are intractable.
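The general simulate, summarize, and compare logic behind an ABC analysis of admixture can be illustrated with a minimal sketch. It is not MetHis and does not reproduce the paper's method: it tracks individual ancestry fractions directly instead of simulating SNPs, uses two summary statistics rather than 24, assumes constant per-generation contributions, and substitutes plain rejection ABC for the Random-Forest and Neural-Network ABC steps. All function names, prior bounds, and parameter values are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def simulate_admixture(s1, s2, n_gen, n_ind, rng):
    """Return each individual's ancestry fraction from source 1 after n_gen generations.

    Toy model (an assumption of this sketch): every parent comes from source 1
    (prob. s1), source 2 (prob. s2), or the previous admixed generation
    (prob. 1 - s1 - s2). A child's fraction is the mean of its two parents'.
    """
    def draw_parent(previous):
        u = rng.random()
        if previous is None:              # founding generation: sources only
            return 1.0 if u < s1 / (s1 + s2) else 0.0
        if u < s1:
            return 1.0                    # parent from source 1
        if u < s1 + s2:
            return 0.0                    # parent from source 2
        return rng.choice(previous)       # parent from the admixed population

    previous = None
    for _ in range(n_gen):
        previous = [(draw_parent(previous) + draw_parent(previous)) / 2
                    for _ in range(n_ind)]
    return previous


def summarize(fractions):
    """Two summary statistics: mean and standard deviation of admixture fractions."""
    return (statistics.fmean(fractions), statistics.pstdev(fractions))


def rejection_abc(observed, n_sims, n_keep, rng, n_gen=10, n_ind=100):
    """Draw (s1, s2) from uniform priors and keep the n_keep closest simulations."""
    scored = []
    for _ in range(n_sims):
        s1 = rng.uniform(0.01, 0.3)       # hypothetical prior bounds
        s2 = rng.uniform(0.01, 0.3)
        sim = summarize(simulate_admixture(s1, s2, n_gen, n_ind, rng))
        scored.append((math.dist(observed, sim), s1, s2))
    scored.sort()                         # smallest distance first
    return [(s1, s2) for _, s1, s2 in scored[:n_keep]]


# Pseudo-observed data generated with known parameters, then "inferred" back.
rng = random.Random(42)
observed = summarize(simulate_admixture(0.05, 0.15, n_gen=10, n_ind=100, rng=rng))
accepted = rejection_abc(observed, n_sims=300, n_keep=15, rng=rng)
ratio = statistics.fmean(s1 / (s1 + s2) for s1, s2 in accepted)
print(f"observed mean admixture: {observed[0]:.3f}")
print(f"posterior mean of s1 / (s1 + s2): {ratio:.3f}")
```

In this toy model the mean admixture fraction mainly informs the ratio s1 / (s1 + s2), while the spread among individuals carries information about the magnitude of recent contributions, which is one reason moments beyond the mean are useful as summary statistics.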
