Using Temporally Spaced Sequences to Simultaneously Estimate Migration Rates, Mutation Rate and Population Sizes in Measurably Evolving Populations

We present a Bayesian inference approach for simultaneously estimating the mutation rate, population sizes, and migration rates in an island-structured population from temporally and spatially sampled sequence data. Markov chain Monte Carlo (MCMC) is used to draw samples from the posterior probability distribution. We demonstrate on simulated data that the chain reaches equilibrium and recovers the true parameter values. A real HIV DNA sequence data set with two demes, semen and blood, is used to illustrate the method by fitting asymmetric migration rates and different population sizes. This data set exhibits a bimodal joint posterior distribution, with the modes favoring different migration directions. The full data set was subsequently split temporally for further analysis. The qualitative behavior of one subset was similar to the bimodal distribution observed with the full data set. The temporally split data showed significant differences over time in the posterior distributions and in the parameter estimates.
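As a rough illustration of what drawing samples from a posterior distribution by MCMC involves, the following Python sketch runs a Metropolis-Hastings sampler on a deliberately simple toy problem. It is not the sampler described here, which would also have to explore genealogies, migration rates and several population sizes. The one-parameter model (exponential waiting times with mean `theta`), the exponential prior, the multiplicative proposal and all function and parameter names are assumptions made for this example only.

```python
import math
import random


def log_posterior(theta, times, prior_mean=10.0):
    """Unnormalised log posterior of a toy scale parameter theta > 0."""
    if theta <= 0:
        return float("-inf")
    # Toy likelihood (assumption): each waiting time is exponential with mean theta.
    log_lik = sum(-math.log(theta) - t / theta for t in times)
    # Toy prior (assumption): exponential with mean prior_mean.
    log_prior = -theta / prior_mean
    return log_lik + log_prior


def metropolis_hastings(times, n_steps=20000, burn_in=2000, scale=0.5, seed=1):
    """Return post-burn-in samples of theta from the toy posterior."""
    rng = random.Random(seed)
    theta = 1.0
    current = log_posterior(theta, times)
    samples = []
    for step in range(n_steps):
        # Multiplicative (log-scale random walk) proposal keeps theta positive.
        proposal = theta * math.exp(scale * rng.gauss(0.0, 1.0))
        candidate = log_posterior(proposal, times)
        # The Hastings ratio for this proposal is proposal / theta.
        log_ratio = candidate - current + math.log(proposal / theta)
        # Accept with probability min(1, exp(log_ratio)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta, current = proposal, candidate
        if step >= burn_in:
            samples.append(theta)
    return samples
```

Calling `metropolis_hastings(times)` on a list of positive waiting times returns a list of posterior draws whose mean and spread summarise the toy parameter. A single chain of this kind can stay near one mode of a multimodal posterior for many steps, so multimodality is usually checked with several runs or starting points.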
