Genetic regulatory signatures underlying islet gene expression and type 2 diabetes

Significance The majority of genetic variants associated with type 2 diabetes (T2D) are located outside of genes, in noncoding regions that may regulate gene expression in disease-relevant tissues such as pancreatic islets. Here, we present the largest integrated analysis to date of high-resolution, high-throughput human islet molecular profiling data to characterize the genome (DNA), epigenome (DNA packaging), and transcriptome (gene expression). We find that T2D genetic variants are enriched in regions of the genome where the transcription factor Regulatory Factor X (RFX) is predicted to bind in an islet-specific manner. Genetic variants that increase T2D risk are predicted to disrupt RFX binding, providing a molecular mechanism to explain how the genome can influence the epigenome, modulating gene expression and ultimately T2D risk.

Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using the assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTLs. Aggregate allelic bias signatures in TF footprints enabled us to genetically reconstruct TF binding affinities de novo, which supports the quality of the TF footprint predictions. We found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions. Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome, and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition.
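
The abstract does not describe how allelic bias is quantified. As a minimal sketch, assuming the simplest common approach (an exact two-sided binomial test on reference versus alternate read counts at a heterozygous SNP inside a footprint), the idea might look like the following. The function names, the `p_ref` parameter, and the read counts are hypothetical illustrations and not the authors' pipeline.

```python
from math import comb


def allelic_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the reference allele (NaN if no coverage)."""
    total = ref_reads + alt_reads
    return ref_reads / total if total else float("nan")


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance.

    Assumption: under no allelic bias, each read carries the reference
    allele with probability p_ref (0.5 by default; a different value
    could be used to model reference mapping bias).
    """
    total = ref_reads + alt_reads
    if total == 0:
        return 1.0
    # Probability of every possible reference-read count under the null.
    pmf = [
        comb(total, k) * p_ref**k * (1 - p_ref) ** (total - k)
        for k in range(total + 1)
    ]
    observed = pmf[ref_reads]
    # Sum the outcomes that are no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts at one heterozygous SNP in a footprint.
    ref, alt = 18, 4
    print(f"reference fraction = {allelic_fraction(ref, alt):.2f}")
    print(f"two-sided p-value  = {binomial_two_sided_p(ref, alt):.4g}")
```

In this sketch a small p-value flags a SNP at which chromatin accessibility reads favor one allele. Aggregating such signals over many footprints of the same motif is one way the per-position allele preferences of a TF could be estimated.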
