A global atlas of genetic associations of 220 deep phenotypes

Current genome-wide association studies (GWASs) do not yet capture sufficient diversity in either populations or the scope of phenotypes. To address the need to expand the atlas of genetic associations in non-European populations, we conducted 220 deep-phenotype GWASs (disease endpoints, biomarkers, and medication usage) in BioBank Japan (n = 179,000), incorporating past medical history and text-mining results from electronic medical records. Meta-analyses with harmonized phenotypes in the UK Biobank and FinnGen (n_total = 628,000) identified over 4,000 novel loci, substantially deepening the resolution of the genomic map of human traits; this gain was aided by the inclusion of diseases endemic to East Asia and of East Asian-specific variants. The atlas elucidated a globally shared landscape of pleiotropy, exemplified by the MHC locus, where we conducted fine-mapping by HLA imputation. Finally, to increase the value of the deep-phenotype GWASs, we statistically decomposed matrices of phenome-wide summary statistics and identified latent genetic components, which pinpointed the responsible variants and the shared biological mechanisms underlying current disease classifications across populations. The decomposed components enabled genetically informed subtyping of similar diseases (e.g., allergic diseases). Our study suggests a potential avenue for hypothesis-free re-investigation of human disease classifications through genetics.
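
To make the cross-biobank meta-analysis step concrete, the following is a minimal sketch of how per-cohort effect estimates for one variant and one phenotype could be combined. It assumes a fixed-effect, inverse-variance-weighted scheme, which the abstract does not specify; the names `CohortEstimate` and `ivw_meta` and all numeric values are hypothetical illustrations, not taken from the study.

```python
import math
from dataclasses import dataclass


@dataclass
class CohortEstimate:
    """Summary statistics for one variant in one cohort (hypothetical container)."""
    beta: float  # per-allele effect estimate
    se: float    # standard error of beta


def ivw_meta(estimates):
    """Fixed-effect inverse-variance-weighted meta-analysis (assumed scheme).

    Returns the pooled effect, its standard error, the z-score, and a
    two-sided p-value under a normal approximation.
    """
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (e.se ** 2) for e in estimates]
    total_weight = sum(weights)
    beta = sum(w * e.beta for w, e in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value: P(|Z| > |z|) for a standard normal Z.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Made-up numbers standing in for three cohorts; not results from the paper.
cohorts = [
    CohortEstimate(beta=0.10, se=0.04),
    CohortEstimate(beta=0.08, se=0.03),
    CohortEstimate(beta=0.12, se=0.05),
]
pooled_beta, pooled_se, z_score, p_value = ivw_meta(cohorts)
print(f"beta={pooled_beta:.4f} se={pooled_se:.4f} z={z_score:.2f} p={p_value:.2e}")
```

Under this scheme the pooled standard error is never larger than that of the most precise cohort, which is one reason combining biobanks can reveal loci that no single cohort detects.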
