Distinguishing between population bottleneck and population subdivision by a Bayesian model choice procedure

Although most natural populations are genetically subdivided, they are often analysed as if they were panmictic units. In particular, signals of past demographic size changes are often inferred from genetic data by assuming that the analysed sample is drawn from a population without any internal subdivision. However, it has been shown that a bottleneck signal can result from the presence of some recent immigrants in a population. It is therefore important to contrast these two alternative scenarios in a model choice procedure, to prevent wrong conclusions from being drawn. Here we use an Approximate Bayesian Computation (ABC) approach to infer whether observed patterns of genetic diversity in a given sample are more compatible with it being drawn from a panmictic population that has gone through some size change, or from one or several demes belonging to a recent finite island model. Simulations show that we can correctly identify samples drawn from a subdivided population in up to 95% of cases for a wide range of parameters. We apply our model choice procedure to the chimpanzee (Pan troglodytes) and find conclusive evidence that Western and Eastern chimpanzee samples are drawn from a spatially subdivided population.
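The general logic of ABC model choice by rejection can be illustrated with a minimal sketch. It is not the procedure used in the study: the two simulators below are toy stand-ins (a single Gaussian versus a two-component mixture) rather than coalescent simulations of a bottleneck or of a finite island model, and the summary statistics, the function names, and the settings `n_sims` and `keep` are assumptions chosen for illustration only. With equal prior weight on each model, the posterior probability of a model is approximated by its share among the simulations whose summary statistics fall closest to the observed ones.

```python
import random
import statistics


def summarize(sample):
    """Reduce a sample to two summary statistics (mean, standard deviation)."""
    return (statistics.mean(sample), statistics.pstdev(sample))


def simulate_model_a(rng, n):
    """Toy stand-in for model A: one Gaussian with a scale drawn from its prior."""
    scale = rng.uniform(0.5, 2.0)
    return [rng.gauss(0.0, scale) for _ in range(n)]


def simulate_model_b(rng, n):
    """Toy stand-in for model B: two components separated by a random offset."""
    offset = rng.uniform(2.0, 4.0)
    return [rng.gauss(rng.choice((-offset, offset)), 0.5) for _ in range(n)]


def distance(stats_a, stats_b):
    """Euclidean distance between two vectors of summary statistics."""
    return sum((a - b) ** 2 for a, b in zip(stats_a, stats_b)) ** 0.5


def abc_model_choice(observed, simulators, n_sims=5000, keep=100, seed=1):
    """Approximate posterior model probabilities by rejection sampling."""
    rng = random.Random(seed)
    target = summarize(observed)
    n = len(observed)
    draws = []
    for _ in range(n_sims):
        model = rng.randrange(len(simulators))  # uniform prior over models
        stats = summarize(simulators[model](rng, n))
        draws.append((distance(stats, target), model))
    draws.sort()  # retain the simulations closest to the observed statistics
    accepted = [model for _, model in draws[:keep]]
    return [accepted.count(m) / keep for m in range(len(simulators))]


# Pseudo-observed data generated under the toy model B with a fixed offset.
data_rng = random.Random(42)
observed = [data_rng.gauss(data_rng.choice((-3.0, 3.0)), 0.5) for _ in range(50)]
posterior = abc_model_choice(observed, [simulate_model_a, simulate_model_b])
```

Here `posterior[0]` and `posterior[1]` are the approximate posterior probabilities of the two toy models. In a real application the simulators would be replaced by genetic simulations under each demographic scenario, and the raw acceptance proportions are often refined with a regression adjustment.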
