Visualizing spatial population structure with estimated effective migration surfaces

Genetic data often exhibit patterns that are broadly consistent with “isolation by distance”, a phenomenon in which genetic similarity tends to decay with geographic distance. In a heterogeneous habitat, this decay may occur more quickly in some regions than in others: for example, barriers to gene flow can accelerate genetic differentiation between groups that are located close together in space. We use the concept of “effective migration” to model the relationship between genetics and geography: in this paradigm, effective migration is low in regions where genetic similarity decays quickly. We present a method that uses large-scale, geographically indexed genetic data to quantify and visualize variation in effective migration across the habitat, which can help identify potential barriers to gene flow. Our approach uses a population genetic model to relate underlying migration rates to expected pairwise genetic dissimilarities, and estimates migration rates by matching these expectations to the observed dissimilarities. We illustrate the potential and limitations of our method using simulations and geo-referenced genetic data from elephant, human and Arabidopsis thaliana populations. The resulting visualizations highlight important features of spatial population structure that are difficult to discern with existing methods for summarizing genetic variation, such as principal components analysis.
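
The link between migration rates and expected pairwise dissimilarities can be illustrated with a toy calculation. The sketch below is not the method itself: it assumes a one-dimensional chain of demes and uses effective resistance (the sum of 1/rate along the path) as a stand-in for expected genetic dissimilarity. The function name `chain_resistance` and the rate values are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def chain_resistance(rates):
    """Pairwise effective resistance between demes on a 1-D chain.

    rates[k] is an assumed migration rate on the edge joining deme k
    and deme k + 1. On a chain, resistances add in series, so the
    resistance between demes i and j is the sum of 1 / rate over the
    edges that separate them.
    """
    # cum[i] = total resistance from deme 0 to deme i
    cum = [0.0] + list(accumulate(1.0 / m for m in rates))
    n = len(cum)
    return [[abs(cum[i] - cum[j]) for j in range(n)] for i in range(n)]


# Six demes in a row; the edge between demes 2 and 3 acts as a barrier
# (illustrative values, not estimates from any data set).
rates = [1.0, 1.0, 0.1, 1.0, 1.0]
R = chain_resistance(rates)

within = R[0][2]  # two steps apart, same side of the barrier
across = R[2][4]  # two steps apart, straddling the barrier

print(f"same side of barrier: {within:.1f}")
print(f"across the barrier:   {across:.1f}")
```

Both pairs are the same geographic distance apart, yet the pair that straddles the low-migration edge has a much larger resistance. This is the pattern the abstract describes: similarity decays faster where effective migration is low. Estimation would run in the opposite direction, adjusting the rates until the implied dissimilarities match the observed ones.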
