A Confidence Set Inference Method for Identifying SNPs That Regulate Quantitative Phenotypes

Aims: We introduce a family-based confidence set inference (CSI) method that can be used in preliminary genome-wide association studies to obtain confidence sets of SNPs that contribute a specific percentage to the additive genetic variance of quantitative traits.

Methods: Developed in the framework of generalized linear mixed models, the method uses data from outbred families of arbitrary size and structure. Through our own simulation study and an analysis of the Genetic Analysis Workshop 16 simulated data, we study the properties of the method and compare its performance with that of the family association method described by Chen and Abecasis [Am J Hum Genet 2007;81:913–926]. We also analyze the Framingham Heart Study data to identify SNPs regulating high-density lipoprotein levels.

Results: The simulation studies demonstrated that CSI yields confidence sets with correct coverage and that it can outperform the method of Chen and Abecasis [Am J Hum Genet 2007;81:913–926]. Furthermore, we identified five SNPs that potentially regulate high-density lipoprotein levels: rs9989419, rs11586238, rs1754415, rs9355648, and rs9356560.

Conclusion: The CSI method provides confidence sets of SNPs that contribute to the genetic variance of quantitative traits and is a competitive alternative to currently used family association methods. The approach is particularly useful in genome-wide association studies because it substantially reduces the number of SNPs that need to be investigated in follow-up studies.
