Shared activity patterns arising at genetic susceptibility loci reveal underlying genomic and cellular architecture of human disease

Genetic variants underlying complex traits, including disease susceptibility, are enriched within transcriptional regulatory elements (promoters and enhancers). There is emerging evidence that regulatory elements associated with particular traits or diseases share patterns of transcriptional regulation. Accordingly, shared transcriptional regulation (coexpression) may help to prioritise loci associated with a given trait and to identify the biological processes underlying it. Using cap analysis of gene expression (CAGE) profiles of promoter- and enhancer-derived RNAs across 1824 human samples, we quantified coexpression of RNAs originating from trait-associated regulatory regions using a novel analytical method (network density analysis; NDA). For most traits studied, sequence variants in regulatory regions were linked to tightly coexpressed networks that are likely to share important functional characteristics. These networks implicate particular cell types and tissues in disease pathogenesis; for example, variants associated with ulcerative colitis are linked to expression in gut tissue, whereas Crohn’s disease variants are restricted to immune cells. We show that this coexpression signal provides additional, independent information for fine mapping of likely causative variants. The approach also identifies additional genetic variants associated with specific traits, including an association between regulation of the OCT1 cation transporter and genetic variants underlying circulating cholesterol levels. Together, these results enable a deeper biological understanding of the causal basis of complex traits.

ONE SENTENCE SUMMARY: We discover that variants associated with a specific disease share expression profiles across tissues and cell types, enabling fine mapping and identification of new disease-associated variants, and illuminating key cell types involved in disease pathogenesis.
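The general idea of testing whether trait-associated regulatory regions are more tightly coexpressed than expected by chance can be illustrated with a small sketch. This is not the NDA method itself, whose details are not given in the abstract. It assumes a simplified stand-in: regions are linked when the Pearson correlation of their expression profiles exceeds a threshold, network density is the fraction of linked pairs, and significance is estimated by comparing against randomly drawn region sets. The function names, the 0.8 threshold, and the permutation scheme are all hypothetical choices made for illustration.

```python
import random
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    return num / den if den else 0.0


def network_density(profiles, ids, threshold=0.8):
    """Fraction of region pairs whose correlation reaches the threshold.

    `profiles` maps a region id to its expression values across samples.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    pairs = list(combinations(ids, 2))
    if not pairs:
        return 0.0
    edges = sum(
        1 for a, b in pairs if pearson(profiles[a], profiles[b]) >= threshold
    )
    return edges / len(pairs)


def permutation_p_value(profiles, trait_ids, n_perm=1000, threshold=0.8, seed=0):
    """Compare the observed density with densities of random region sets.

    Returns (observed_density, empirical_p_value). Random sets are drawn
    uniformly from all regions, which is a simplifying assumption.
    """
    rng = random.Random(seed)
    observed = network_density(profiles, trait_ids, threshold)
    all_ids = sorted(profiles)
    hits = sum(
        1
        for _ in range(n_perm)
        if network_density(
            profiles, rng.sample(all_ids, len(trait_ids)), threshold
        ) >= observed
    )
    # Add-one correction keeps the empirical p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

In this sketch, a trait whose regions share expression profiles would yield a density close to 1 and a small empirical p-value, whereas unrelated regions would yield a density near that of random sets. A real analysis would also need to account for linkage disequilibrium and for regions that are near each other in the genome, which this example ignores.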
