Navigome: Navigating the Human Phenome

Enough genome-wide association studies (GWAS) are now available to cluster phenotypes into genetically informed categories and to navigate the “phenome” space of human traits. Using a collection of 465 GWAS covering 465 human traits, we generated genetic correlations as well as pathway, gene-wise, and tissue-wise associations using MAGMA and S-PrediXcan. Of 7,267 biological pathways tested, only 898 were significantly associated with at least one trait. Similarly, of ~20,000 protein-coding genes tested, 12,311 showed an association. Based on the genetic correlations between all traits, we constructed a phenome map using t-distributed stochastic neighbor embedding (t-SNE), in which each of the 465 traits is visualized as an individual point. This map reveals well-defined clusters of traits such as education/high longevity, lower longevity, height, body composition, and depression/anxiety/neuroticism. These clusters are enriched in specific groups of pathways, such as lipid pathways in the lower longevity cluster and neuronal pathways in the body composition and education clusters. The map and all other analyses are available in the Navigome web interface (https://phenviz.navigome.com).
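
A map built from genetic correlations needs those correlations expressed as pairwise dissimilarities before an embedding method such as t-SNE can place traits as points. The abstract does not state how this conversion was done, so the following is only a minimal sketch under stated assumptions: the distance is taken to be 1 - rg, the trait names and correlation values are made up for illustration, and the helper names `correlation_to_distance` and `nearest_trait` are hypothetical.

```python
from typing import List


def correlation_to_distance(rg: List[List[float]]) -> List[List[float]]:
    """Convert a square genetic-correlation matrix into dissimilarities.

    Assumption (not from the source): distance = 1 - rg, with rg clipped
    to [-1, 1] because estimated correlations can fall outside that range.
    """
    n = len(rg)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Average both triangles so the result is exactly symmetric.
            r = (rg[i][j] + rg[j][i]) / 2.0
            r = max(-1.0, min(1.0, r))
            dist[i][j] = dist[j][i] = 1.0 - r
    return dist


def nearest_trait(dist: List[List[float]], names: List[str], query: str) -> str:
    """Return the trait with the smallest distance to `query`."""
    q = names.index(query)
    others = (j for j in range(len(names)) if j != q)
    return names[min(others, key=lambda j: dist[q][j])]


# Illustrative, made-up genetic correlations between four traits.
traits = ["education", "longevity", "neuroticism", "depression"]
rg = [
    [1.00, 0.60, -0.20, -0.25],
    [0.60, 1.00, -0.30, -0.35],
    [-0.20, -0.30, 1.00, 0.75],
    [-0.25, -0.35, 0.75, 1.00],
]

dist = correlation_to_distance(rg)
print(nearest_trait(dist, traits, "depression"))
```

A matrix of this form could then be handed to a t-SNE implementation that accepts precomputed distances, for example scikit-learn's `TSNE` with `metric="precomputed"` and `init="random"`, to obtain two-dimensional coordinates for each trait.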
