On cross-ancestry cancer polygenic risk scores

Polygenic risk scores (PRS) can provide useful information for personalized risk stratification and disease risk assessment, especially when combined with non-genetic risk factors. However, their construction depends on the availability of summary statistics from genome-wide association studies (GWAS) that are independent of the target sample. Previous reports indicate that compatibility is best when the GWAS and the target sample match in ancestry. Yet GWAS, especially in the field of cancer, often lack diversity and are dominated by individuals of European ancestry. This bias is a limiting factor in PRS research. Using electronic health records and genetic data from the UK Biobank, we contrast the utility of breast and prostate cancer PRS derived from external European-ancestry-based GWAS across African, East Asian, European, and South Asian ancestry groups. We highlight differences in the PRS distributions of these groups that are amplified when PRS methods condense hundreds of thousands of variants into a single score. While European-GWAS-derived PRS were not directly transferable across ancestries on an absolute scale, we establish their predictive potential when they are considered separately within each group. For example, the top 10% of the breast cancer PRS distribution within each ancestry group was significantly enriched for breast cancer cases compared to the bottom 90% (odds ratio of 2.81 [95% CI: 2.69, 2.93] in European, 2.88 [95% CI: 1.85, 4.48] in African, 2.60 [95% CI: 1.25, 5.40] in East Asian, and 2.33 [95% CI: 1.55, 3.51] in South Asian individuals). Our findings highlight a compromise solution for PRS research to compensate for the lack of diversity in well-powered, predominantly European GWAS efforts while recruitment of diverse participants in the field catches up.
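
To make the within-group comparison concrete, the following is a minimal sketch of how a "top 10% versus bottom 90%" odds ratio could be computed separately for each ancestry group. It is an illustration only, not the analysis pipeline behind the reported estimates: it uses an unadjusted 2x2 table with a Wald (Woolf) 95% confidence interval, whereas the published figures may come from covariate-adjusted models. The function names, the `(group, score, is_case)` record layout, and the continuity correction are all assumptions introduced here.

```python
import math
from collections import defaultdict


def top_fraction_odds_ratio(scores, cases, top_fraction=0.10):
    """Unadjusted odds ratio of being a case in the top score stratum
    versus the remainder, with a Wald (Woolf) 95% confidence interval.

    Hypothetical helper for illustration; no covariate adjustment.
    """
    n = len(scores)
    # Rank individuals by score; the k highest form the top stratum.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    k = max(1, round(n * top_fraction))
    top = set(order[:k])

    a = sum(1 for i in top if cases[i])                         # cases, top
    b = k - a                                                   # controls, top
    c = sum(1 for i in range(n) if i not in top and cases[i])   # cases, rest
    d = (n - k) - c                                             # controls, rest

    # Assumption: add 0.5 to every cell if any cell is empty
    # (Haldane-Anscombe correction) so the ratio stays defined.
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5

    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


def odds_ratios_by_group(records, top_fraction=0.10):
    """Apply the cutoff within each group, never on the pooled scores.

    records: iterable of (group, score, is_case) tuples (assumed layout).
    """
    by_group = defaultdict(lambda: ([], []))
    for group, score, is_case in records:
        by_group[group][0].append(score)
        by_group[group][1].append(bool(is_case))
    return {
        group: top_fraction_odds_ratio(s, c, top_fraction)
        for group, (s, c) in by_group.items()
    }
```

Because the threshold is taken within each group, a shift in the score distribution between groups does not change which individuals fall in that group's top stratum, which is the sense in which the scores are usable within groups even when they are not comparable on an absolute scale.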
