A practical guide to environmental association analysis in landscape genomics

Landscape genomics is an emerging research field that aims to identify the environmental factors that shape adaptive genetic variation and the gene variants that drive local adaptation. Its development has been facilitated by next‐generation sequencing, which allows for screening thousands to millions of single nucleotide polymorphisms in many individuals and populations at reasonable costs. In parallel, data sets describing environmental factors have greatly improved and increasingly become publicly accessible. Accordingly, numerous analytical methods for environmental association studies have been developed. Environmental association analysis identifies genetic variants associated with particular environmental factors and has the potential to uncover adaptive patterns that are not discovered by traditional tests for the detection of outlier loci based on population genetic differentiation. We review methods for conducting environmental association analysis including categorical tests, logistic regressions, matrix correlations, general linear models and mixed effects models. We discuss the advantages and disadvantages of different approaches, provide a list of dedicated software packages and their specific properties, and stress the importance of incorporating neutral genetic structure in the analysis. We also touch on additional important aspects such as sampling design, environmental data preparation, pooled and reduced‐representation sequencing, candidate‐gene approaches, linearity of allele–environment associations and the combination of environmental association analyses with traditional outlier detection tests. We conclude by summarizing expected future directions in the field, such as the extension of statistical approaches, environmental association analysis for ecological gene annotation, and the need for replication and post hoc validation studies.
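As a rough illustration of one of the reviewed method families, logistic regression, the sketch below tests whether the presence of an allele in sampled individuals is associated with a single environmental variable. It is a minimal example under stated assumptions, not the procedure of any software package discussed in the review: the function name `logistic_lrt`, the simulated data and the parameter values are all hypothetical, the model is fitted by plain Newton iterations, and the p-value comes from a likelihood-ratio test with one degree of freedom. The sketch deliberately ignores neutral genetic structure, which the review stresses must be incorporated in a real analysis, and it assumes the data are not perfectly separated.

```python
import math

import numpy as np


def logistic_lrt(presence, env, n_iter=25):
    """Fit presence ~ env by logistic regression; return (slope, p-value).

    presence: 0/1 allele presence per individual (hypothetical input format).
    env: one environmental value per individual.
    Assumes both alleles states occur and no perfect separation.
    """
    y = np.asarray(presence, dtype=float)
    x = np.asarray(env, dtype=float)
    x = (x - x.mean()) / x.std()              # standardize the predictor
    X = np.column_stack([np.ones_like(x), x])  # intercept + slope

    # Newton-Raphson iterations for the maximum-likelihood estimate
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        weights = p * (1.0 - p)
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)

    # Log-likelihood of the fitted model (clipped to avoid log(0))
    p = np.clip(1.0 / (1.0 + np.exp(-X @ beta)), 1e-12, 1.0 - 1e-12)
    ll_full = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

    # Log-likelihood of the intercept-only (null) model
    p0 = y.mean()
    ll_null = len(y) * (p0 * math.log(p0) + (1.0 - p0) * math.log(1.0 - p0))

    # Likelihood-ratio statistic; chi-square(1 df) tail = erfc(sqrt(G / 2))
    g_stat = max(2.0 * (ll_full - ll_null), 0.0)
    p_value = math.erfc(math.sqrt(g_stat / 2.0))
    return beta[1], p_value


# Simulated example: one locus whose allele tracks the gradient
rng = np.random.default_rng(1)
temperature = np.linspace(-2.0, 2.0, 200)      # hypothetical gradient
prob = 1.0 / (1.0 + np.exp(-2.0 * temperature))
allele_present = (rng.random(200) < prob).astype(int)
slope, p_value = logistic_lrt(allele_present, temperature)
print(f"slope = {slope:.2f}, p = {p_value:.3g}")
```

In a genome scan the same test would be repeated for every locus and every environmental variable, so the resulting p-values would need correction for multiple testing before any locus is treated as a candidate.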
