Variance Estimation and Confidence Intervals from Genome-wide Association Studies Through High-dimensional Misspecified Mixed Model Analysis.

We study variance estimation and the associated confidence intervals for parameters characterizing genetic effects in genome-wide association studies (GWAS), based on misspecified mixed model analysis. Previous studies have shown that, despite the model misspecification, certain quantities of genetic interest are estimable, and that consistent estimators of these quantities can be obtained by the restricted maximum likelihood (REML) method under a misspecified linear mixed model. However, the asymptotic variance of such a REML estimator has a complicated form and is not readily implemented in practice. In this paper, we develop practical and computationally convenient methods for estimating these asymptotic variances and constructing the associated confidence intervals. The performance of the proposed methods is evaluated empirically through Monte Carlo simulations and a real-data application.
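
To illustrate how an estimated asymptotic variance turns into a confidence interval, the following is a minimal sketch of a generic Wald-type interval. It assumes the estimator is approximately normal and that a variance estimate is already available; the function name `wald_interval` and its arguments are hypothetical, and this is not necessarily the construction used in the paper.

```python
import math
from statistics import NormalDist


def wald_interval(estimate, variance_estimate, level=0.95):
    """Generic Wald-type interval: estimate +/- z * sqrt(variance_estimate).

    Assumption (not from the paper): the estimator is approximately normal
    and `variance_estimate` estimates the variance of the estimator itself
    (that is, it is already scaled by the sample size).
    """
    if variance_estimate < 0:
        raise ValueError("variance_estimate must be non-negative")
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    # Two-sided standard normal quantile for the requested coverage level.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(variance_estimate)
    return estimate - half_width, estimate + half_width


# Hypothetical numbers for illustration only, not results from the paper.
lower, upper = wald_interval(estimate=0.5, variance_estimate=0.01)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```

For a quantity restricted to a bounded range, such as a proportion of variance, an interval of this form may extend beyond the range and would then need truncation or a transformation.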
