Modelling Linkage Disequilibrium, and Identifying Recombination Hotspots Using SNP Data

We introduce a new statistical model for patterns of linkage disequilibrium (LD) among multiple SNPs in a population sample. The model overcomes limitations of existing approaches to understanding, summarizing, and interpreting LD by (i) relating patterns of LD directly to the underlying recombination process; (ii) considering all loci simultaneously, rather than pairwise; (iii) avoiding the assumption that LD necessarily has a “block-like” structure; and (iv) being computationally tractable for huge genomic regions (up to complete chromosomes). We examine in detail one natural application of the model: estimation of underlying recombination rates from population data. Using simulation, we show that when recombination is assumed constant across the region of interest, recombination rate estimates based on our model are competitive with the best currently available methods. More importantly, we demonstrate, on real and simulated data, the potential of the model to help identify and quantify fine-scale variation in recombination rate from population data. We also outline how the model could be useful in other contexts, such as the development of more efficient haplotype-based methods for LD mapping.
