An expression-directed linear mixed model (edLMM) discovering low-effect genetic variants

Detecting genetic variants with low effect sizes using a moderate sample size is difficult, hindering downstream efforts to understand pathology and estimate heritability. In this work, we developed an alternative approach to estimating the polygenic term in a linear mixed model (LMM), using informative weights learned from training genetically predicted gene expression models. The resulting LMM estimates the genetic background by incorporating each variant's relevance to gene expression. Our protocol, the expression-directed linear mixed model (edLMM), enables the discovery of subtle signals from low-effect variants using a moderate sample size. By applying edLMM to cohorts of around 5,000 individuals with either binary (WTCCC) or quantitative (NFBC1966) traits, we demonstrated its power gain at the low-effect end of the genetic etiology spectrum. In aggregate, the additional low-effect variants detected by edLMM substantially improved the estimation of missing heritability. edLMM moves precision medicine forward by accurately detecting the contribution of low-effect genetic variants to human diseases.
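
The idea of weighting the polygenic term by relevance to gene expression can be illustrated with a minimal sketch. It is not the edLMM implementation: it assumes the polygenic covariance is a genetic relationship matrix in which each standardized variant is scaled by a non-negative weight (for example, the absolute value of its coefficient in an expression prediction model). The function names, the toy genotypes, and the weight values are all hypothetical.

```python
from statistics import fmean, pstdev


def standardize_columns(genotypes):
    """Center and scale each variant (column) to mean 0 and variance 1."""
    n_variants = len(genotypes[0])
    columns = []
    for j in range(n_variants):
        col = [row[j] for row in genotypes]
        mu, sd = fmean(col), pstdev(col)
        # Monomorphic variants carry no information; set them to zero.
        columns.append([(x - mu) / sd if sd > 0 else 0.0 for x in col])
    # Transpose back to an individuals x variants layout.
    return [list(row) for row in zip(*columns)]


def weighted_kinship(genotypes, weights):
    """Assumed form: K[i][k] = sum_j w_j * z_ij * z_kj / sum_j w_j."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    z = standardize_columns(genotypes)
    n = len(z)
    return [
        [sum(w * a * b for w, a, b in zip(weights, z[i], z[k])) / total
         for k in range(n)]
        for i in range(n)
    ]


# Toy data (hypothetical): 4 individuals x 3 variants, coded as allele counts.
genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 1],
]
# Hypothetical expression-derived weights, one per variant.
weights = [0.5, 0.1, 0.4]

K = weighted_kinship(genotypes, weights)
K_unweighted = weighted_kinship(genotypes, [1.0, 1.0, 1.0])
```

With equal weights the function reduces to the standard genetic relationship matrix commonly used in LMMs, so the weights only change how much each variant contributes to the estimated genetic background. The resulting matrix would take the place of the usual kinship matrix as the covariance of the random polygenic effect.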
