Regression-based multi-marker analysis for genome-wide association studies using haplotype similarity

Although haplotype analyses are prevalent in genetic association studies, the main statistical analyses in genome-wide association (GWA) studies still focus mostly on single-SNP analyses. Several practical issues dampen practitioners' enthusiasm for haplotype-based analysis in a GWA setting, including the large number of degrees of freedom required and the treatment of missing phase information. To avoid these pitfalls, we propose a similarity-based regression model. It captures genetic variants via haplotype similarity to reduce the degrees of freedom, and uses phase-independent similarity measures to bypass the need to impute phase information. We construct a score test to detect association between trait similarity and genetic similarity, and identify the limiting distribution of the score statistic. We show that the gene-trait similarity regression is closely connected with random effects haplotype analysis, although the two are commonly considered separate modeling tools in haplotype analysis. The proposed method is computationally efficient, allows for covariates, and is applicable to both quantitative and binary traits. It serves as an effective tool for multi-marker analysis in genome-wide association studies.
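As a rough illustration of the gene-trait similarity idea, the sketch below regresses pairwise trait similarity on pairwise genetic similarity for a quantitative trait without covariates. It is not the score test described above: the identity-by-state similarity measure, the cross-product trait similarity, the function names, and the use of a permutation p-value in place of the limiting distribution of the score statistic are all assumptions made for illustration.

```python
import numpy as np


def ibs_similarity(genotypes):
    """Phase-independent similarity: mean identity-by-state sharing.

    `genotypes` is an (n_subjects, n_markers) array of minor-allele
    counts coded 0/1/2 (an assumed coding). Per-marker sharing is
    1 - |g_i - g_j| / 2, averaged over markers.
    """
    g = np.asarray(genotypes, dtype=float)
    diff = np.abs(g[:, None, :] - g[None, :, :])
    return (1.0 - diff / 2.0).mean(axis=2)


def similarity_regression_slope(trait, similarity):
    """OLS slope of trait similarity on genetic similarity over pairs i < j.

    Trait similarity is taken to be the cross-product of mean-centered
    trait values (an assumption of this sketch).
    """
    y = np.asarray(trait, dtype=float)
    resid = y - y.mean()
    z = np.outer(resid, resid)
    upper = np.triu_indices(len(y), k=1)  # each unordered pair once
    s_centered = similarity[upper] - similarity[upper].mean()
    return float(s_centered @ z[upper] / (s_centered @ s_centered))


def permutation_pvalue(trait, similarity, n_perm=999, seed=0):
    """One-sided permutation p-value for a positive slope.

    Permuting trait values breaks any gene-trait link; this stands in
    for the asymptotic reference distribution, which is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(trait, dtype=float)
    observed = similarity_regression_slope(y, similarity)
    exceed = sum(
        similarity_regression_slope(rng.permutation(y), similarity) >= observed
        for _ in range(n_perm)
    )
    return observed, (exceed + 1) / (n_perm + 1)
```

A positive slope indicates that subjects who are genetically more alike in the region also tend to have more similar trait values. Because every marker in the region enters through a single similarity score, the test uses one regression coefficient rather than one parameter per haplotype.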
