High-throughput functional analysis of lncRNA core promoters elucidates rules governing tissue specificity

Transcription initiates at both coding and non-coding genomic elements, including mRNA and long non-coding RNA (lncRNA) core promoters and enhancer RNAs (eRNAs). However, each class has a different expression profile, with lncRNAs and eRNAs being the most tissue-specific. How these complex differences in expression profile and tissue specificity are encoded in a single DNA sequence remains unresolved. Here, we address this question using computational approaches and massively parallel reporter assays (MPRAs) surveying hundreds of promoters and enhancers. We find that divergent lncRNA and mRNA core promoters have higher capacities to drive transcription than non-divergent lncRNA and mRNA core promoters, respectively. Conversely, intergenic lncRNAs (lincRNAs) and eRNAs have lower capacities to drive transcription and are more tissue-specific than divergent genes. This higher tissue specificity is strongly associated with less complex transcription factor (TF) motif profiles at the core promoter. We experimentally validate these findings by testing both engineered single-nucleotide deletions and human single-nucleotide polymorphisms (SNPs) in MPRAs. In both cases, we observe that single nucleotides associated with many motifs are important drivers of promoter activity. Moreover, we find that 22% of common SNPs in core promoter regions have significant regulatory effects. Collectively, our findings show that high TF motif density provides redundancy and serves as a robust mechanism to increase promoter activity at the expense of tissue specificity, suggesting that specificity of expression may be regulated by simplicity of motif usage.
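The abstract does not state which tissue-specificity measure was used. As a purely illustrative sketch, one widely used index is tau, defined for expression values x_1..x_n across n tissues as the sum of (1 - x_i / x_max) divided by (n - 1). It ranges from 0 (uniform expression) to 1 (expression in a single tissue). The function name `tau_index` and the toy expression vectors below are assumptions made for illustration, not taken from the study.

```python
import math
from typing import Sequence


def tau_index(expression: Sequence[float]) -> float:
    """Tau tissue-specificity index for one gene (hypothetical helper).

    `expression` holds non-negative expression values, one per tissue.
    Returns 0.0 for uniform expression and 1.0 when only one tissue
    expresses the gene. Returns NaN if the gene is not expressed anywhere,
    because the index is undefined in that case.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs expression values from at least two tissues")
    if any(x < 0 for x in expression):
        raise ValueError("expression values must be non-negative")

    x_max = max(expression)
    if x_max == 0:
        return math.nan  # no expression in any tissue: specificity undefined

    # Sum of each tissue's shortfall relative to the peak tissue,
    # normalized by the number of non-peak tissues.
    return sum(1 - x / x_max for x in expression) / (n - 1)


# Toy examples (made-up values, not data from the paper).
broad = tau_index([5.0, 5.0, 5.0, 5.0])      # expressed equally everywhere
specific = tau_index([0.0, 0.0, 10.0, 0.0])  # expressed in one tissue only
```

Under this definition, a broadly expressed promoter scores near 0 and a promoter active in a single tissue scores near 1, which is the sense in which lincRNAs and eRNAs are described above as more tissue-specific than divergent genes.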
