Statistical analysis of amplified fragment length polymorphism data: a toolbox for molecular ecologists and evolutionists

The amplified fragment length polymorphism (AFLP) technique has recently become very popular and is now frequently applied to a wide variety of organisms. The technical specificities of the AFLP procedure have been well documented over the years, but information about the statistical analysis of AFLP data is, by contrast, sparse and scattered. In this review, we describe the various methods available to handle AFLP data, focusing on four research topics at the population or individual level of analysis: (i) assessment of genetic diversity; (ii) identification of population structure; (iii) identification of hybrid individuals; and (iv) detection of markers associated with phenotypes. Two kinds of analysis methods can be distinguished, depending on whether they are based on the direct study of band presences or absences in AFLP profiles (‘band‐based’ methods) or on allelic frequencies estimated at each locus from these profiles (‘allele frequency‐based’ methods). We investigate the characteristics and limitations of these statistical tools. Finally, we call for wider adoption of methodologies borrowed from other research fields, such as those specifically designed to deal with binary data.
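The distinction between the two families of methods can be illustrated with a minimal sketch. It is not taken from the review itself: the toy profiles, the function names, the choice of the Jaccard coefficient as the band-based measure, and the square-root estimator of the null-allele frequency (which assumes diploid individuals in Hardy-Weinberg equilibrium and is known to be biased for small samples) are all illustrative assumptions.

```python
from math import sqrt

# Hypothetical AFLP profiles: one row per individual, one column per locus,
# 1 = band present, 0 = band absent.
profiles = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
]


def jaccard_similarity(a, b):
    """Band-based: bands shared by both profiles / bands present in either.

    Shared absences are ignored, so no genetic model is assumed.
    """
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0


def null_allele_frequencies(data):
    """Allele frequency-based: estimate q (band-absent allele) per locus.

    Assumes Hardy-Weinberg equilibrium, so that the proportion of
    individuals lacking the band estimates q**2 (square-root method).
    """
    n = len(data)
    n_loci = len(data[0])
    return [
        sqrt(sum(1 for row in data if row[j] == 0) / n)
        for j in range(n_loci)
    ]


def expected_heterozygosity(q):
    """Expected heterozygosity at a biallelic locus: 2 * p * q."""
    return 2.0 * q * (1.0 - q)


if __name__ == "__main__":
    print("Jaccard(ind 1, ind 2):", jaccard_similarity(profiles[0], profiles[1]))
    q_hat = null_allele_frequencies(profiles)
    print("Estimated null-allele frequencies:", q_hat)
    print("Expected heterozygosity per locus:",
          [expected_heterozygosity(q) for q in q_hat])
```

The band-based function works directly on the binary profiles, whereas the allele frequency-based functions first pass through a population-genetic assumption; this is where the two families differ in what they require of the data.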
