A rapid epistatic mixed-model association analysis by linear retransformations of genomic estimated values

Abstract

Motivation: Epistasis offers a feasible way to probe the genetic mechanisms underlying complex traits. In practice, however, the computational cost of testing interactions hinders their detection, especially when a linear mixed model (LMM) is used to control type I error in the presence of population structure and cryptic relatedness.

Results: A rapid epistatic mixed-model association analysis (REMMA) method was developed to overcome this computational limitation. The method first estimates individuals' epistatic effects with an extended genomic best linear unbiased prediction (EG-BLUP) model that includes additive and epistatic kinship matrices. Pairwise interaction effects are then obtained by linear retransformations of the individuals' epistatic effects. Simulation studies showed that REMMA controlled type I error and achieved higher statistical power for detecting epistatic quantitative trait nucleotides (QTNs) than the existing LMM-based method FaST-LMM. We applied REMMA to two real datasets: a mouse dataset and the Wellcome Trust Case Control Consortium (WTCCC) data. Application to the mouse data further confirmed that REMMA controls type I error. For the WTCCC data, we found that most epistatic QTNs for type 1 diabetes (T1D) were located in a major histocompatibility complex (MHC) region, from which a large interaction network with 12 hub genes (each interacting with ten or more genes) was established.

Availability and implementation: REMMA is freely available at https://github.com/chaoning/REMMA.

Supplementary information: Supplementary data are available at Bioinformatics online.
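The two-step idea in the Results (predict individuals' epistatic values, then map them back to pairwise interaction effects) can be illustrated with a small numerical sketch. This is not the REMMA implementation. It assumes that the variance components (`var_a`, `var_aa`, `var_e`) are already known, that interaction covariates are products of standardized genotypes, and that the epistatic kinship matrix is built explicitly from those products, which is only practical for toy sizes. All function and variable names are hypothetical, and no test statistics are computed.

```python
import itertools
import numpy as np


def simulate_genotypes(n, m, rng):
    """Draw toy genotypes coded 0/1/2 and standardize each SNP column."""
    geno = rng.integers(0, 3, size=(n, m)).astype(float)
    return (geno - geno.mean(axis=0)) / geno.std(axis=0)


def pairwise_products(Z):
    """Build one interaction covariate per SNP pair (i < j)."""
    pairs = list(itertools.combinations(range(Z.shape[1]), 2))
    W = np.column_stack([Z[:, i] * Z[:, j] for i, j in pairs])
    return W, pairs


def epistatic_effects(y, X, Z, var_a, var_aa, var_e):
    """Return individuals' epistatic values and pairwise interaction effects.

    Variance components are assumed known here; in practice they would
    have to be estimated (for example by REML).
    """
    n, m = Z.shape
    W, pairs = pairwise_products(Z)
    p = W.shape[1]

    K_a = Z @ Z.T / m    # additive kinship
    K_aa = W @ W.T / p   # epistatic kinship (explicit toy construction)
    V = var_a * K_a + var_aa * K_aa + var_e * np.eye(n)
    V_inv = np.linalg.inv(V)

    # Generalized least squares estimate of the fixed effects.
    beta = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
    r = V_inv @ (y - X @ beta)

    # Step 1: BLUP of each individual's total epistatic value.
    g_aa = var_aa * K_aa @ r
    # Step 2: linear map from the same quantities to per-pair effects.
    alpha = (var_aa / p) * W.T @ r
    return g_aa, alpha, pairs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Z = simulate_genotypes(40, 6, rng)
    X = np.ones((40, 1))          # intercept only
    y = rng.normal(size=40)       # placeholder phenotype
    g_aa, alpha, pairs = epistatic_effects(y, X, Z, 0.3, 0.2, 0.5)
    W, _ = pairwise_products(Z)
    # The pairwise effects add back up to the individual epistatic values.
    print(np.allclose(W @ alpha, g_aa))
```

The point of the sketch is that both quantities are linear functions of the same vector `r`, so the mixed model only has to be fitted once rather than once per SNP pair.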
