Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation

Summary: Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information, but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for simple and standard situations, typically involving a small number of samples. We propose here a computer program (DIY ABC) for inference based on approximate Bayesian computation (ABC), in which scenarios can be customized by the user to fit many complex situations involving any number of populations and samples. Such scenarios involve any combination of population divergences, admixtures and population size changes. DIY ABC can be used to compare competing scenarios, estimate parameters for one or more scenarios, and compute bias and precision measures for a given scenario and known values of parameters. The current version applies to unlinked microsatellite data. This article describes the key methods used in the program and outlines its main features. The analysis of one simulated and one real dataset, both with complex evolutionary scenarios, illustrates the main possibilities of DIY ABC.

Availability: The software DIY ABC is freely available at http://www.montpellier.inra.fr/CBGP/diyabc.

Contact: j.cornuet@imperial.ac.uk

Supplementary information: Supplementary data are also available at http://www.montpellier.inra.fr/CBGP/diyabc
