GxE PRS: Genotype-environment interaction in polygenic risk score models for quantitative and binary traits

Polygenic risk score (PRS) models have transformed genetics by enabling the prediction of complex traits and diseases from an individual’s genetic profile. However, the impact of genotype-environment interaction (GxE) on the performance and applicability of PRS models remains insufficiently explored. Existing GxE PRS models are often applied inappropriately, which can inflate type 1 error rates and compromise results. In this study, we propose a novel GxE PRS model that correctly incorporates the GxE component when analyzing complex traits and diseases. Through extensive simulations, we demonstrate that the proposed model outperforms existing models both in controlling type 1 error rates and in statistical power. We also apply the proposed model to real data and report significant GxE effects for both quantitative and binary traits. For a quantitative trait, we find that alcohol intake frequency (ALC) modulates genetic effects on body mass index (BMI). For a binary trait, we find that waist-to-hip ratio (WHR) modulates genetic effects on hypertension (HYP). These findings underscore the importance of a robust model that controls type 1 error rates and thereby prevents spurious GxE signals. To facilitate the use of our approach, we have developed an R software package, GxE PRS, designed to detect and estimate GxE effects. Overall, our study highlights the importance of accurate GxE modeling and its implications for genetic risk prediction, and provides a practical tool to support further research in this area.
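To make the notion of a GxE component concrete, the sketch below fits a generic textbook interaction regression, y ~ 1 + PRS + E + PRS×E, to simulated data. It is not the model proposed in this study, whose specification is not given in the abstract, and it does not use the GxE PRS R package. The function name `fit_gxe_ols`, the simulated effect sizes, and the use of Python with NumPy are all assumptions made for illustration only.

```python
import numpy as np

def fit_gxe_ols(y, prs, env):
    """Ordinary least squares fit of y ~ 1 + PRS + E + PRS*E.

    Hypothetical helper for illustration: returns the coefficient vector
    and the corresponding t-statistics (index 3 is the interaction term).
    """
    X = np.column_stack([np.ones_like(prs), prs, env, prs * env])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]           # residual degrees of freedom
    sigma2 = resid @ resid / dof            # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)   # coefficient covariance matrix
    se = np.sqrt(np.diag(cov))
    return beta, beta / se

# Simulated data with assumed effect sizes (not taken from the study).
rng = np.random.default_rng(0)
n = 5000
prs = rng.standard_normal(n)   # standardized polygenic risk score
env = rng.standard_normal(n)   # standardized environmental exposure
y = 0.3 * prs + 0.2 * env + 0.1 * prs * env + rng.standard_normal(n)

beta, t = fit_gxe_ols(y, prs, env)
print("interaction estimate:", beta[3], "t-statistic:", t[3])
```

In this simple setup the interaction coefficient is the quantity a GxE test targets. The abstract's point is that naive use of such models can inflate type 1 error rates, so this sketch should be read as a description of the term being tested and not as a recommended analysis.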
