Spatially explicit Bayesian clustering models in population genetics

This article reviews recent developments in Bayesian algorithms that explicitly include geographical information in the inference of population structure. Current models differ substantially in their prior distributions and background assumptions, and fall into two broad categories: models with admixture and models without admixture. To aid users of this new generation of spatially explicit programs, we clarify the assumptions underlying the models, and we test these models in situations where their assumptions are not met. We show that models without admixture are not robust to the inclusion of admixed individuals in the sample, and so provide an incorrect assessment of population genetic structure in many cases. In contrast, admixture models are robust to an absence of admixture in the sample. We also give statistical and conceptual reasons why data should be explored using spatially explicit models that include admixture.
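The contrast between the two model categories can be illustrated with a minimal likelihood sketch. It is not the implementation of any program reviewed here, and it leaves out the spatial priors entirely. It assumes diploid genotypes coded as 0, 1 or 2 copies of a reference allele at independent biallelic loci in Hardy-Weinberg equilibrium; the function names, the allele frequencies and the two-cluster setup are hypothetical choices made for illustration.

```python
import math

def genotype_loglik(genotype, freqs):
    """Log-likelihood of diploid genotypes (0, 1 or 2 reference-allele copies
    per locus) given per-locus reference-allele frequencies, assuming
    Hardy-Weinberg proportions and independent loci."""
    total = 0.0
    for g, f in zip(genotype, freqs):
        total += math.log(math.comb(2, g) * f**g * (1 - f) ** (2 - g))
    return total

def no_admixture_loglik(genotype, cluster_freqs, k):
    """Model without admixture: the whole genome comes from cluster k."""
    return genotype_loglik(genotype, cluster_freqs[k])

def admixture_loglik(genotype, cluster_freqs, q):
    """Model with admixture: each allele copy comes from cluster k with
    probability q[k], so the effective frequency at a locus is the
    q-weighted average of the cluster frequencies."""
    n_loci = len(genotype)
    mixed = [
        sum(q[k] * cluster_freqs[k][l] for k in range(len(q)))
        for l in range(n_loci)
    ]
    return genotype_loglik(genotype, mixed)

# Hypothetical example: two well-differentiated clusters, ten loci.
cluster_freqs = [[0.9] * 10, [0.1] * 10]
hybrid = [1] * 10  # heterozygous everywhere, as expected for a first-generation hybrid
pure = [2] * 10    # typical of cluster 0

# The hybrid fits poorly under either single cluster but well with q = (0.5, 0.5).
print(no_admixture_loglik(hybrid, cluster_freqs, 0))
print(no_admixture_loglik(hybrid, cluster_freqs, 1))
print(admixture_loglik(hybrid, cluster_freqs, [0.5, 0.5]))

# With q = (1, 0) the admixture likelihood equals the no-admixture one.
print(admixture_loglik(pure, cluster_freqs, [1.0, 0.0]))
print(no_admixture_loglik(pure, cluster_freqs, 0))
```

Under these assumptions the no-admixture likelihood is the special case of the admixture likelihood in which one component of q equals 1. This is one way to read the robustness result: an admixture model can still represent a sample without admixed individuals, whereas a model without admixture has no adequate description for a hybrid genotype.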
