Inference on population history and model checking using DNA sequence and microsatellite data with the software DIYABC (v1.0)

Background
Approximate Bayesian computation (ABC) is a recent, flexible class of Monte-Carlo algorithms increasingly used to make model-based inference on complex evolutionary scenarios that have acted on natural populations. The software DIYABC offers a user-friendly interface allowing non-expert users to consider population histories involving any combination of population divergences, admixtures and population size changes. We here describe and illustrate new developments of this software that mainly include (i) inference from DNA sequence data in addition to, or separately from, microsatellite data, (ii) the possibility to analyze five categories of loci, considering balanced or unbalanced sex ratios: autosomal diploid, autosomal haploid, X-linked, Y-linked and mitochondrial, and (iii) the possibility to perform model checking computation to assess the "goodness-of-fit" of a model, a feature of ABC analysis that has so far been neglected.

Results
We used controlled simulated data sets generated under evolutionary scenarios involving various divergence and admixture events to evaluate the effect of mixing autosomal microsatellite, mtDNA and/or nuclear autosomal DNA sequence data on inferences. This evaluation included the comparison of competing scenarios and the quantification of their relative support, and the estimation of parameter posterior distributions under a given scenario. We also considered a set of scenarios often compared when making ABC inferences on the routes of introduction of invasive species, to illustrate the value of the new model checking option of DIYABC for assessing model misfit.

Conclusions
Our new developments of the integrated software DIYABC should be particularly useful for making inference on complex evolutionary scenarios involving both recent and ancient historical events and using various types of molecular markers in diploid or haploid organisms. They offer a handy way for non-expert users to perform model checking computation within an ABC framework, hence filling a gap in ABC analysis. The software DIYABC V1.0 is freely available at http://www1.montpellier.inra.fr/CBGP/diyabc.
