Inference with selection, varying population size, and evolving population structure: application of ABC to a forward–backward coalescent process with interactions

Genetic data are often used to infer population history and demographic changes, or to detect genes under selection. Inferential methods are commonly based on models that make several strong assumptions: demography and population structure are assumed to be known a priori, the evolution of the genetic composition of a population affects neither demography nor population structure, and there is neither selection nor interaction between and within genetic strains. In this paper, we present a stochastic birth-death model with competitive interaction to describe an asexual population, and we develop an inferential procedure for ecological, demographic and genetic parameters. First, we show how genetic diversity and genealogies are related to birth and death rates, and to how individuals compete within and between strains. This leads us to propose an original model of phylogenies, with trait structure and interactions, that allows multiple mergers. Second, we develop an Approximate Bayesian Computation framework to use our model for analyzing genetic data. We apply our procedure to simulated and real data and show that it gives accurate estimates of the model parameters. Finally, we illustrate the approach on real data by analyzing the genetic diversity of microsatellites on Y-chromosomes sampled from Central Asian populations, in order to test whether different social organizations show significantly different fertility.
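
To make the combination of a birth-death model with competition and Approximate Bayesian Computation more concrete, the following is a minimal sketch of ABC rejection sampling on a toy version of such a model. It is not the procedure of the paper: it has a single strain, it uses the population size at a fixed time as a stand-in for the genetic summary statistics, and every name and value in it (`simulate_population_size`, `abc_rejection`, the uniform prior, the rates, the tolerance, the observed value 50) is a hypothetical choice made for illustration.

```python
import random


def simulate_population_size(b, d, c, n0, t_max, rng):
    """Gillespie simulation of a one-strain birth-death process with competition.

    Assumed toy dynamics: each individual gives birth at rate b and dies at
    rate d + c * (n - 1), where n is the current population size.
    Returns the population size at time t_max (0 if the population dies out).
    """
    n, t = n0, 0.0
    while n > 0:
        birth_rate = b * n
        death_rate = (d + c * (n - 1)) * n
        total_rate = birth_rate + death_rate
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_max:
            break
        if rng.random() * total_rate < birth_rate:
            n += 1  # birth event
        else:
            n -= 1  # death event (natural death or competition)
    return n


def abc_rejection(observed, n_draws, tolerance, rng,
                  prior=(0.6, 3.0), d=0.5, c=0.02, n0=10, t_max=10.0):
    """ABC rejection sampler for the birth rate b.

    Draws b from a uniform prior, simulates the model, and keeps b whenever
    the simulated summary statistic is within `tolerance` of the observed one.
    """
    accepted = []
    for _ in range(n_draws):
        b = rng.uniform(*prior)
        simulated = simulate_population_size(b, d, c, n0, t_max, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(b)
    return accepted


# Hypothetical observed summary statistic: 50 individuals at time t_max.
rng = random.Random(0)
posterior_sample = abc_rejection(observed=50, n_draws=300, tolerance=10, rng=rng)
if posterior_sample:
    print(len(posterior_sample), sum(posterior_sample) / len(posterior_sample))
```

The accepted values approximate the posterior distribution of the birth rate given the summary statistic. A smaller tolerance gives a closer approximation at the cost of fewer accepted draws.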
