High density linkage disequilibrium mapping using models of haplotype block variation

MOTIVATION: The presence of millions of single nucleotide polymorphisms (SNPs) in the human genome has spurred interest in genetic mapping methods based on linkage disequilibrium. The recently discovered haplotype block structure of human variation promises to improve the effectiveness of these methods. A key difficulty for mapping techniques is the cost involved in separately identifying the haplotypes on each of an individual's chromosomes.

RESULTS: We present a new approach for performing linkage disequilibrium mapping using high density haplotype or genotype data. Our method is based on a statistical model of haplotype block variation, which takes account of recombination hotspots, bottlenecks, genetic drift and mutation. We test our technique on two empirically determined high density datasets, attempting to recover the location of an SNP which was hidden and converted into phenotype information. We compare the results against a mapping method based on individual SNPs as well as a competing haplotype-based approach. We show that our strategy significantly outperforms these other approaches when used as a guide for resequencing and that it can also deal with both unphased genotype data and low penetrance diseases.

AVAILABILITY: HaploBlock executables for Linux, Mac OS X and Sun OS, as well as user documentation, are available online at http://bioinfo.cs.technion.ac.il/haploblock/
