Haplotype Inference and Its Application in Linkage Disequilibrium Mapping

Haplotypes, defined as sets of DNA polymorphism markers physically located on a single chromosome, have attracted rapidly growing interest owing to their potential value in disease gene identification and pharmacogenomics. Because molecular haplotyping methods remain too costly for routine use, statistical techniques for haplotype inference have emerged as the most time- and cost-efficient approach. This chapter explains the statistical theory and algorithms behind several in silico haplotype phasing strategies; reviews the partition-ligation idea for dealing with a large number of linked SNP marker loci; and proposes new methods for handling genotype uncertainty in genotyping machine output as well as in pooled marker data. We also discuss the application of haplotype information to disease mutation detection in case-control designs and the impact of haplotype information on the accuracy of locus estimation. As an illustration, we apply the haplotyping tool PL-EM jointly with the LD mapping algorithm BLADE to a case-control study of the SNP markers surrounding the Alzheimer disease susceptibility gene APOE.
