A novel approach for haplotype-based association analysis using family data

Background: Haplotype-based approaches have been extensively studied for case-control association mapping in recent years. Haplotype methods have been shown to provide more consistent results than single-locus approaches, especially when causal variants are not typed. Power also improves when similar or rare haplotypes are clustered into groups, which reduces the degrees of freedom of the association test. For family-based association studies, one commonly used strategy is the Transmission Disequilibrium Test (TDT), which examines the imbalanced transmission of alleles or haplotypes to affected and unaffected children. Many extensions have been developed to handle general pedigrees and continuous traits.

Results: In this paper, we propose a new haplotype-based association method for family data that departs from the TDT framework. Our approach, termed F_HapMiner, builds on our earlier work on haplotype inference from pedigree data and on haplotype-based association mapping. It first infers the diplotype (haplotype pair) of each individual in each pedigree, assuming no recombination within a family. A phenotype score is then defined for each founder haplotype. Finally, F_HapMiner clusters the founder haplotypes by similarity and identifies haplotype clusters that show significant association with the disease or trait. We performed extensive simulations under realistic assumptions to evaluate the approach across different allele frequencies, linkage disequilibrium (LD) structures, disease models and sample sizes. Comparisons with single-locus and haplotype-based TDT methods show that our approach consistently outperforms the TDT-based approaches regardless of disease model, local LD structure or allele/haplotype frequency.

Conclusion: We present a novel haplotype-based association approach using family data. Experimental results demonstrate that it achieves significantly higher power than TDT-based approaches.
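For readers unfamiliar with the TDT baseline mentioned in the Background, the following is a minimal sketch of the classical single-locus, biallelic TDT statistic. It is not part of F_HapMiner and is not taken from the paper; the function name `tdt_statistic` and its arguments are illustrative assumptions. Given counts of heterozygous parents who transmitted (b) or did not transmit (c) a given allele to an affected child, the statistic is (b - c)^2 / (b + c), referred to a chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt


def tdt_statistic(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Classical biallelic TDT (McNemar-type) statistic and its p-value.

    transmitted   -- hypothetical count b: heterozygous parents who passed
                     the allele of interest to an affected child
    untransmitted -- hypothetical count c: heterozygous parents who passed
                     the other allele instead
    """
    if transmitted < 0 or untransmitted < 0:
        raise ValueError("counts must be non-negative")
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")

    # Under the null of no linkage or association, b and c are expected to be equal.
    chi2 = (transmitted - untransmitted) ** 2 / informative

    # Upper tail of a chi-square with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    chi2, p = tdt_statistic(60, 40)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

The example counts are invented for illustration. Haplotype-based TDT variants generalize this idea to multi-allelic haplotypes, which increases the degrees of freedom; that cost is what haplotype clustering aims to reduce.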
