netCRS: Network-based comorbidity risk score for prediction of myocardial infarction using biobank-scaled PheWAS data

Polygenic risk scores (PRSs) help identify an individual’s genetic susceptibility to disease by combining the individual’s genetic profile with single-nucleotide polymorphisms (SNPs) identified in genome-wide association studies. Although patients are often afflicted by multiple diseases, simultaneously or in succession, conventional PRSs do not consider genetic relationships across diseases. Even multi-trait PRSs, which model genetic effects for more than one disease at a time, either cover too few phenotypes to reflect a patient’s comorbidity state accurately or are biased by the choice of traits. We therefore developed novel network-based comorbidity risk scores that quantify associations among multiple phenotypes from phenome-wide association studies (PheWAS). We first constructed a disease-SNP heterogeneous multi-layered network (DS-Net) consisting of a disease network (disease-layer) and a SNP network (SNP-layer). The disease-layer describes the population-level interactome derived from PheWAS data, and the SNP-layer was constructed according to linkage disequilibrium. The two layers were connected so that information in the population-level interactome could be translated into individual-level inferences. Graph-based semi-supervised learning was then applied to predict comorbidity scores on the disease-layer for each subject: the SNP-layer receives the individual’s genotype data as input, and the disease-layer holds the propagated output, namely that individual’s comorbidity scores for multiple diseases. These comorbidity scores were combined by logistic regression into a single score, denoted netCRS. The DS-Net was constructed from UK Biobank PheWAS data, and individual genetic profiles were obtained from the Penn Medicine Biobank. As a proof of concept, myocardial infarction (MI) was selected to compare netCRS with the PRS based on pruning and thresholding (PRS-PT).
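The scoring step can be illustrated with a minimal sketch. The abstract does not state the exact propagation formula, so the sketch assumes a standard Laplacian-regularized form of graph-based semi-supervised learning, f = (I + mu L)^(-1) y, with L = D - W. The function name `propagate_scores`, the parameter `mu`, and all edge weights and genotypes are hypothetical toy values, not taken from the study.

```python
import numpy as np

def propagate_scores(W, y, mu=1.0):
    """Solve (I + mu * L) f = y, where L = D - W is the graph Laplacian.

    W  : symmetric, non-negative adjacency matrix of the whole network
    y  : initial node labels (genotype on SNP nodes, zero on disease nodes)
    mu : smoothness weight (assumed hyperparameter)
    """
    W = np.asarray(W, dtype=float)
    y = np.asarray(y, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.solve(np.eye(len(y)) + mu * L, y)

# Toy DS-Net with 3 diseases and 4 SNPs (all weights are made up).
W_dd = np.array([[0.0, 0.6, 0.0],        # disease-layer edges
                 [0.6, 0.0, 0.3],
                 [0.0, 0.3, 0.0]])
W_ss = np.array([[0.0, 0.8, 0.0, 0.0],   # SNP-layer edges (LD-like)
                 [0.8, 0.0, 0.2, 0.0],
                 [0.0, 0.2, 0.0, 0.5],
                 [0.0, 0.0, 0.5, 0.0]])
W_ds = np.array([[0.4, 0.0, 0.0, 0.0],   # disease-SNP links between layers
                 [0.0, 0.3, 0.1, 0.0],
                 [0.0, 0.0, 0.0, 0.7]])

# Stack the two layers into one heterogeneous network (diseases first).
W = np.block([[W_dd, W_ds],
              [W_ds.T, W_ss]])

# One individual's risk-allele counts enter on the SNP nodes only.
genotype = np.array([2.0, 0.0, 1.0, 0.0])
y = np.concatenate([np.zeros(3), genotype])

f = propagate_scores(W, y)
disease_scores = f[:3]   # propagated comorbidity scores on the disease-layer
print(disease_scores)
```

In a pipeline of the kind described above, the vector `disease_scores` computed for each individual would then serve as the feature vector for the logistic regression that yields a single combined score.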
The combined model (netCRS + PRS-PT + covariates) improved the AUC by 6.26% relative to the PRS-PT + covariates model. In risk stratification, the combined model identified a high-risk group whose risk of MI was up to approximately eight-fold higher than that of the low-risk group. netCRS and PRS-PT complement each other in identifying patients at high risk of MI. We expect that these risk prediction models will support the development of prevention strategies and help reduce MI morbidity and mortality.
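The two evaluation quantities mentioned above, AUC and the fold difference in risk between strata, can be computed as in the following sketch. It assumes that the fold difference is a ratio of case rates between the highest-scoring and lowest-scoring fractions of individuals; the abstract does not say whether the study used this ratio or, for example, an odds ratio. The helper names `auc` and `relative_risk` and the ten toy individuals are hypothetical.

```python
def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def relative_risk(scores, labels, frac=0.2):
    """Case rate in the top `frac` of scores divided by that in the bottom `frac`."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = max(1, int(len(scores) * frac))
    low_rate = sum(labels[i] for i in order[:k]) / k
    high_rate = sum(labels[i] for i in order[-k:]) / k
    return high_rate / low_rate if low_rate > 0 else float("inf")

# Toy data: predicted risk and MI status (1 = case) for ten made-up individuals.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
labels = [0, 1, 0, 0, 0, 1, 0, 1, 1, 1]

toy_auc = auc(scores, labels)
toy_rr = relative_risk(scores, labels, frac=0.5)
print(toy_auc, toy_rr)
```

Applying the same two functions to the scores of a baseline model and of a combined model gives the kind of comparison reported here. The toy numbers carry no information about the study's actual results.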
