Accurate liability estimation improves power in ascertained case-control studies

Linear mixed models (LMMs) have emerged as the method of choice for confounded genome-wide association studies. However, the performance of LMMs in nonrandomly ascertained case-control studies deteriorates with increasing sample size. We propose a framework called LEAP (liability estimator as a phenotype; https://github.com/omerwe/LEAP) that tests for association with estimated latent values corresponding to severity of phenotype, and we demonstrate that this can lead to a substantial power increase.
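The idea of a latent liability behind a binary case-control label can be illustrated with the classical liability threshold model. The sketch below is not the LEAP procedure itself, which the abstract does not spell out. It assumes a standard-normal liability and a known population prevalence, and computes only the expected liability of an individual given case or control status, with no genotype information. The function name `expected_liability` and its parameters are hypothetical and chosen for illustration.

```python
from statistics import NormalDist


def expected_liability(is_case: bool, prevalence: float) -> float:
    """Mean of a standard-normal liability given case/control status.

    Assumption (not from the source): liability ~ N(0, 1), and an
    individual is a case exactly when liability exceeds the threshold
    t that satisfies P(liability > t) = prevalence.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")

    std_normal = NormalDist()  # mean 0, standard deviation 1
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    density = std_normal.pdf(threshold)

    # Means of the upper and lower tails of a truncated standard normal.
    if is_case:
        return density / prevalence
    return -density / (1.0 - prevalence)


if __name__ == "__main__":
    for k in (0.5, 0.1, 0.01):
        case_mean = expected_liability(True, k)
        control_mean = expected_liability(False, k)
        print(f"prevalence={k}: case={case_mean:.3f}, control={control_mean:.3f}")
```

Under these assumptions, the rarer the disease, the further the expected liability of a case lies from that of a control, which gives some intuition for why a continuous liability estimate can carry more information than the binary label in an ascertained sample.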
