Biobank-wide association scan identifies risk factors for late-onset Alzheimer’s disease and endophenotypes

Dense genotype data and thousands of phenotypes from large biobanks, coupled with increasingly accessible summary association statistics from genome-wide association studies (GWAS), provide great opportunities to dissect the complex relationships among human traits and diseases. We introduce BADGERS, a powerful method to perform polygenic score-based biobank-wide scans for disease-trait associations. Unlike traditional regression approaches, BADGERS uses GWAS summary statistics as input and does not require multiple traits to be measured on the same cohort. We applied BADGERS to two independent datasets for Alzheimer’s disease (AD; N=61,212). Among the polygenic risk scores (PRSs) for 1,738 traits in the UK Biobank, we identified 48 trait PRSs significantly associated with AD after adjusting for multiple testing. Family history, high cholesterol, and numerous traits related to intelligence and education showed strong and independent associations with AD. Further, we identified 41 PRSs significantly associated with AD endophenotypes. While family history and high cholesterol were strongly associated with neuropathologies and cognitively defined AD subgroups, only intelligence- and education-related traits predicted pre-clinical cognitive phenotypes. These results provide novel insights into the distinct biological processes underlying various risk factors for AD.
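To make the summary-statistics idea concrete, the following is a minimal sketch of how a trait PRS can be tested against a disease without individual-level data. It is not the BADGERS implementation. It assumes independent SNPs and standardized genotypes, in which case the association z-score reduces to a weighted combination of the per-SNP disease GWAS z-scores. The function names are hypothetical, and the Bonferroni correction is an assumption, since the text above does not name the multiple-testing procedure.

```python
import math


def prs_association_z(weights, disease_z):
    """Z-score for the association between a trait PRS and a disease.

    weights   : per-SNP PRS weights for the trait (hypothetical input)
    disease_z : per-SNP z-scores from the disease GWAS, same SNP order

    Assumes independent SNPs and standardized genotypes, so that
    z = sum(w_j * z_j) / sqrt(sum(w_j ** 2)).
    """
    if len(weights) != len(disease_z):
        raise ValueError("weights and disease_z must have the same length")
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0.0:
        raise ValueError("at least one PRS weight must be non-zero")
    return sum(w * z for w, z in zip(weights, disease_z)) / norm


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold (assumed correction, see lead-in)."""
    return alpha / n_tests


if __name__ == "__main__":
    # Toy numbers for illustration only, not values from the study.
    z = prs_association_z([3.0, 4.0], [1.0, 2.0])
    threshold = bonferroni_threshold(0.05, 1738)  # 1,738 trait PRSs scanned
    print(z, two_sided_p(z), threshold)
```

Under this assumed Bonferroni correction, a scan over 1,738 trait PRSs would call an association significant at roughly p < 2.9e-5. Real data would additionally need to account for linkage disequilibrium between SNPs, which this sketch ignores.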