Statistical tests for detecting variance effects in quantitative trait studies

Motivation: Identifying variants, both discrete and continuous, that are associated with quantitative traits, or QTs, is the primary focus of quantitative genetics. Most current methods are limited to identifying mean effects, that is, associations between genotype or covariates and the mean value of a quantitative trait. It is possible, however, that a variant may affect the variance of the quantitative trait instead of, or in addition to, affecting the trait mean. Here, we develop a general methodology to identify covariates with variance effects on a quantitative trait using a Bayesian heteroskedastic linear regression model (BTH). We compare BTH with existing methods for detecting variance effects across a large range of simulations drawn from scenarios common to the analysis of quantitative traits.

Results: We find that BTH and a double generalized linear model (dglm) outperform classical tests used for detecting variance effects in recent genomic studies. Through simulations and application to identifying methylation variance QTs and expression variance QTs, we show that BTH and dglm are less likely than these classical tests to generate spurious discoveries. We identify four variance effects of sex in the Cardiovascular and Pharmacogenetics study. Our work is the first to offer a comprehensive view of variance-identifying methodology. We identify shortcomings in previously used methodology and provide a more conservative and robust alternative. We extend variance effect analysis to a wide array of covariates, which enables a new statistical dimension in the study of sex- and age-specific quantitative trait effects.

Availability and implementation: https://github.com/b2du/bth.

Supplementary information: Supplementary data are available at Bioinformatics online.
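To make the distinction between a mean effect and a variance effect concrete, the following is a minimal sketch only. It is not the BTH model or dglm: it applies a Brown-Forsythe (median-centered Levene) statistic, used here as one example of a classical variance test, to simulated two-group data. The group sizes, effect sizes, permutation scheme, and all function names are illustrative assumptions and do not come from the paper.

```python
import random
import statistics


def brown_forsythe_stat(groups):
    """Brown-Forsythe statistic: one-way ANOVA F computed on |y - group median|."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each observation from its own group's median.
    z = []
    for g in groups:
        med = statistics.median(g)
        z.append([abs(y - med) for y in g])
    z_means = [statistics.fmean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


def permutation_pvalue(groups, n_perm=500, seed=0):
    """Permutation p-value for the statistic above (an assumed, simple scheme)."""
    rng = random.Random(seed)
    observed = brown_forsythe_stat(groups)
    # Remove each group's median first so that a pure mean shift does not
    # break exchangeability of the values being permuted.
    pooled = []
    for g in groups:
        med = statistics.median(g)
        pooled.extend(y - med for y in g)
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if brown_forsythe_stat(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical binary covariate (for example, two groups of individuals).
rng = random.Random(1)
n = 200
baseline = [rng.gauss(0.0, 1.0) for _ in range(n)]
mean_shifted = [rng.gauss(1.0, 1.0) for _ in range(n)]      # mean effect only
variance_shifted = [rng.gauss(0.0, 2.0) for _ in range(n)]  # variance effect only

w_mean = brown_forsythe_stat([baseline, mean_shifted])
w_var = brown_forsythe_stat([baseline, variance_shifted])
p_var = permutation_pvalue([baseline, variance_shifted])

print(f"statistic, mean effect only:     {w_mean:.2f}")
print(f"statistic, variance effect only: {w_var:.2f}")
print(f"permutation p-value, variance effect: {p_var:.4f}")
```

In this sketch the statistic responds to the group whose spread differs and is largely insensitive to a pure shift in the mean. A test of this kind handles only a discrete grouping variable, whereas the abstract describes a regression-based approach that also covers continuous covariates.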
