Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach

A maximum likelihood estimator based on the coalescent for unequal migration rates and different subpopulation sizes is developed. The method uses a Markov chain Monte Carlo approach to investigate possible genealogies with branch lengths and with migration events. Properties of the new method are shown by using simulated data from a four-population n-island model and a source–sink population model. Our estimation method as coded in migrate is tested against genetree; both programs deliver a very similar likelihood surface. The algorithm converges to the estimates fairly quickly, even when the Markov chain is started from unfavorable parameters. The method was used to estimate gene flow in the Nile valley by using mtDNA data from three human populations.
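The genealogies that the Markov chain explores carry both coalescence and migration events. As a rough illustration of that underlying model only (not the estimator itself, and not code taken from migrate), the following sketch simulates one event history backward in time under a structured coalescent. The function name, the argument names, and the rate parameterization are assumptions made for this example: with k_i lineages in subpopulation i, coalescence is taken to occur at rate k_i(k_i - 1)/theta[i], and each lineage in i is taken to move to subpopulation j at rate mig[i][j], with time scaled by the mutation rate.

```python
import random


def simulate_structured_coalescent(sample_sizes, theta, mig, rng=None):
    """Simulate one backward-in-time event history (hypothetical helper).

    sample_sizes[i]: number of sampled lineages in subpopulation i
    theta[i]:        scaled size of subpopulation i (assumed parameterization)
    mig[i][j]:       backward rate at which a lineage in i moves to j (i != j)
    Returns a list of (time, kind, source, target) tuples, ending when a
    single lineage remains.
    """
    rng = rng or random.Random()
    k = list(sample_sizes)          # current lineage count per subpopulation
    n = len(k)
    t = 0.0
    events = []
    while sum(k) > 1:
        # Enumerate every possible next event together with its rate.
        rates = []
        for i in range(n):
            coal = k[i] * (k[i] - 1) / theta[i]
            if coal > 0:
                rates.append((coal, "coalescence", i, i))
            for j in range(n):
                if j != i and k[i] > 0 and mig[i][j] > 0:
                    rates.append((k[i] * mig[i][j], "migration", i, j))
        total = sum(r[0] for r in rates)
        if total == 0:
            raise ValueError("remaining lineages can never meet")
        # Exponential waiting time to the next event, then pick which event
        # occurred with probability proportional to its rate.
        t += rng.expovariate(total)
        pick = rng.uniform(0, total)
        acc = 0
        for rate, kind, i, j in rates:
            acc += rate
            if pick <= acc:
                break
        if kind == "coalescence":
            k[i] -= 1               # two lineages in i merge into one
        else:
            k[i] -= 1               # one lineage moves from i to j
            k[j] += 1
        events.append((t, kind, i, j))
    return events
```

For a four-population n-island setting like the one used in the simulations, one might call it with equal sizes and a symmetric migration matrix, for example `simulate_structured_coalescent([2, 2, 2, 2], [1.0] * 4, m)` where `m` has zeros on the diagonal and a common rate elsewhere. A source-sink layout would instead set the rates in one direction to zero.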
