Mining Associations Between Genetic Markers, Phenotypes, and Covariates

We used Haplotype Pattern Mining, HPM [Toivonen et al., Am J Hum Genet 67:133–45, 2000], for gene localization in the Genetic Analysis Workshop (GAW) 12 isolate data. In HPM, association is analyzed by searching for all trait‐associated haplotype patterns. Data mining algorithms make the search efficient. The strength of each haplotype‐trait association is measured by a linear model, into which a pre‐selected set of covariates is incorporated. Marker‐wise patterns of association are then used to predict the disease gene location. Genome‐wide scans for susceptibility genes were performed for affection status as well as for the quantitative traits (Q1–Q5). The first analyses used small sample sizes, 63–94 trios per trait, comparable to a pilot study of a larger complex disease‐mapping project. Subsequently, the analysis was repeated with approximately 600 cases and 600 controls per trait to increase power. With the small sample sizes, only the susceptibility genes with the strongest effects on the traits could be localized. With the larger sample size, all susceptibility genes but one were correctly localized. First experiments on candidate genes suggested that HPM is applicable even to fine mapping of mutations in DNA sequence. © 2001 Wiley‐Liss, Inc.
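The pipeline summarized above (enumerate haplotype patterns, score each for association with the trait, then aggregate the scores marker by marker) can be illustrated with a minimal Python sketch. It is not the authors' implementation and makes several simplifying assumptions: only contiguous patterns without gaps are enumerated, a plain 2x2 chi-square test on case/control status stands in for the covariate-adjusted linear model, and the threshold of 3.84 and all function names are hypothetical choices made for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]]; 0.0 if any margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def contiguous_patterns(haplotypes, max_len):
    """Collect every contiguous allele pattern (start, alleles) observed in the data."""
    patterns = set()
    for hap in haplotypes:
        for start in range(len(hap)):
            last_end = min(start + max_len, len(hap))
            for end in range(start + 1, last_end + 1):
                patterns.add((start, tuple(hap[start:end])))
    return patterns


def matches(hap, pattern):
    """True if the haplotype carries the pattern's alleles at the pattern's position."""
    start, alleles = pattern
    return tuple(hap[start:start + len(alleles)]) == alleles


def marker_scores(cases, controls, max_len=3, threshold=3.84):
    """Per marker, count the case-enriched patterns that span it and pass the threshold.

    Assumption: a chi-square test replaces the linear model with covariates
    described in the abstract; threshold=3.84 is an illustrative cutoff.
    """
    n_markers = len(cases[0])
    counts = [0] * n_markers
    for pattern in contiguous_patterns(cases + controls, max_len):
        a = sum(matches(h, pattern) for h in cases)      # cases with pattern
        c = sum(matches(h, pattern) for h in controls)   # controls with pattern
        b = len(cases) - a                               # cases without pattern
        d = len(controls) - c                            # controls without pattern
        enriched_in_cases = a * d > b * c
        if enriched_in_cases and chi_square_2x2(a, b, c, d) >= threshold:
            start, alleles = pattern
            for marker in range(start, start + len(alleles)):
                counts[marker] += 1
    return counts


def predict_location(counts):
    """Index of the marker covered by the most associated patterns."""
    return max(range(len(counts)), key=counts.__getitem__)


if __name__ == "__main__":
    # Toy data: cases carry allele 2 at the third marker, controls carry allele 1.
    cases = [[1, 1, 2, 1] for _ in range(10)]
    controls = [[1, 1, 1, 1] for _ in range(10)]
    counts = marker_scores(cases, controls)
    print(counts, predict_location(counts))
```

Counting associated patterns per marker, rather than taking the single best pattern, mirrors the abstract's use of marker-wise patterns of association: markers near a susceptibility locus tend to be covered by many overlapping associated patterns.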

[1] W. Ewens et al. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). American Journal of Human Genetics, 1993.

[2] L. Kruglyak et al. Parametric and nonparametric linkage analysis: a unified multipoint approach. American Journal of Human Genetics, 1996.

[3] A. Chakravarti et al. Patterns of genetic variation in Mendelian and complex traits. Annual Review of Genomics and Human Genetics, 2000.

[4] J. Kere et al. Data mining applied to linkage disequilibrium mapping. American Journal of Human Genetics, 2000.