Massively parallel characterization of transcriptional regulatory elements in three diverse human cell types

The human genome contains millions of candidate cis-regulatory elements (CREs) with cell-type-specific activities that shape both health and myriad disease states. However, we lack a functional understanding of the sequence features that control the activity and cell-type specificity of these CREs. Here, we used lentivirus-based massively parallel reporter assays (lentiMPRAs) to test the regulatory activity of over 680,000 sequences, representing a nearly comprehensive set of all annotated CREs among three cell types (HepG2, K562, and WTC11), and found 41.7% of them to be functional. By testing sequences in both orientations, we found that promoters have significant strand orientation effects. We also observed that their 200-nucleotide cores function as non-cell-type-specific ‘on switches’, providing expression levels similar to those of their associated genes. In contrast, enhancers have weaker orientation effects but greater cell-type specificity. Using our lentiMPRA data, we developed sequence-based models that predict CRE function with high accuracy and delineate regulatory motifs. Testing an additional lentiMPRA library encompassing 60,000 CREs in all three cell types, we further identified factors that determine cell-type specificity. Collectively, our work provides an exhaustive catalog of functional CREs in three widely used cell lines and showcases how large-scale functional measurements can be used to dissect regulatory grammar.
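As a rough illustration of what testing a sequence in both orientations involves, the sketch below builds the reverse-strand version of an element and compares reporter activity between the two orientations. It assumes activity is summarized as a log2 ratio of RNA to DNA barcode counts with a pseudocount, a common MPRA convention that the abstract itself does not specify. The function names and the counts are hypothetical and are not taken from the study.

```python
import math

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the element in the opposite orientation."""
    return seq.translate(_COMPLEMENT)[::-1]


def log2_activity(rna_count: float, dna_count: float, pseudocount: float = 1.0) -> float:
    """Assumed activity score: log2 of pseudocounted RNA over DNA barcode counts."""
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))


def orientation_effect(forward: tuple, reverse: tuple) -> float:
    """Difference in activity between forward and reverse orientations.

    Each argument is a hypothetical (rna_count, dna_count) pair.
    A value near 0 suggests orientation-independent activity.
    """
    return log2_activity(*forward) - log2_activity(*reverse)


if __name__ == "__main__":
    element = "ACGTT"  # toy sequence, not a real CRE
    print(reverse_complement(element))
    print(orientation_effect(forward=(7, 1), reverse=(1, 1)))
```

Under this scoring, an element with a strong orientation effect (as reported here for promoters) would give a value far from zero, while a weaker effect (as reported for enhancers) would give a value closer to zero.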
