Estimation of Multilocus Haplotype Effects Using Weighted Penalised Log‐Likelihood: Analysis of Five Sequence Variations at the Cholesteryl Ester Transfer Protein Gene Locus

Direct analysis of haplotype effects can identify the specific combinations of alleles that are associated with a given phenotype. We introduce a method for direct haplotype analysis that solves two problems arising when haplotypes are analysed in populations of unrelated subjects. The first problem is that haplotype phase is ambiguous in subjects who are heterozygous at multiple loci. Instead of assigning a single, most likely haplotype pair to such subjects, we determined all haplotype pairs compatible with their genotype and calculated the posterior probabilities of these pairs using Bayes’ theorem and estimated haplotype frequencies. For each patient, all possible haplotype pairs were included in the statistical analysis, with the posterior probabilities as weights; these weights were re‐estimated iteratively together with the haplotype effects. The second problem, unstable haplotype effect estimates caused by the large number of haplotypes and the low frequency of some of them, was solved by assuming that haplotypes sharing alleles have similar effects and that the degree of similarity depends on the number of alleles shared. These assumptions were incorporated in a weighted log‐likelihood model through a penalty on differences between the effects of similar haplotypes. Using CETP gene haplotypes, consisting of five closely linked polymorphisms, and baseline CETP and HDL‐C concentrations from the REGRESS population, we demonstrated that the model yielded more stable effect estimates than an analysis based on unambiguous patients only.
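
The weighting step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes biallelic loci coded 0/1, genotypes given as per-locus counts of allele 1, and Hardy-Weinberg proportions for the prior probability of a haplotype pair. The function names and the example frequencies are hypothetical. The iterative re-estimation of weights and effects is not shown, and `shared_alleles` only counts matching alleles; how that count enters the penalty is not specified in the abstract.

```python
from itertools import product


def compatible_pairs(genotype):
    """List unordered haplotype pairs compatible with an unphased genotype.

    genotype: tuple with the count (0, 1 or 2) of allele 1 at each locus.
    """
    per_locus = []
    for count in genotype:
        if count == 0:
            per_locus.append([(0, 0)])
        elif count == 2:
            per_locus.append([(1, 1)])
        else:  # heterozygous: either phase is possible
            per_locus.append([(0, 1), (1, 0)])
    pairs = set()
    for combo in product(*per_locus):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.add(tuple(sorted((h1, h2))))  # ignore the order within a pair
    return sorted(pairs)


def posterior_weights(genotype, freqs):
    """Posterior probability of each compatible pair via Bayes' theorem.

    freqs: dict mapping haplotype tuples to estimated frequencies.
    Prior of a pair assumes Hardy-Weinberg proportions (an assumption here).
    """
    prior = {}
    for h1, h2 in compatible_pairs(genotype):
        p = freqs.get(h1, 0.0) * freqs.get(h2, 0.0)
        if h1 != h2:
            p *= 2.0  # two orderings of a heterozygous pair
        prior[(h1, h2)] = p
    total = sum(prior.values())
    if total == 0.0:
        raise ValueError("no compatible pair has positive prior probability")
    return {pair: p / total for pair, p in prior.items()}


def shared_alleles(h1, h2):
    """Number of loci at which two haplotypes carry the same allele."""
    return sum(a == b for a, b in zip(h1, h2))


# Hypothetical two-locus example: a double heterozygote has two possible pairs.
example_freqs = {(0, 0): 0.4, (1, 1): 0.3, (0, 1): 0.2, (1, 0): 0.1}
weights = posterior_weights((1, 1), example_freqs)
```

In this example the pair `((0, 0), (1, 1))` receives weight 0.24 / 0.28 and the pair `((0, 1), (1, 0))` receives 0.04 / 0.28, so both phasings contribute to the analysis in proportion to their plausibility instead of the less likely one being discarded.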
