Rare Splice Variants in Long Non-Coding RNAs

Long non-coding RNAs (lncRNAs) form a substantial component of the transcriptome and are involved in a wide variety of regulatory mechanisms. Compared to protein-coding genes, they are often expressed at low levels and restricted to a narrow range of cell types or developmental stages. As a consequence, the diversity of their isoforms is still far from being completely catalogued, and debate continues about what fraction of non-coding RNAs truly conveys biological function rather than being “junk”. Here, using a collection of more than 100 transcriptomes from related B-cell lymphomas, we show that lncRNA loci produce a well-defined set of splice variants. Some of these variants are so rare that they become recognizable only in the superposition of dozens or hundreds of transcriptome datasets, and they not infrequently include introns or exons that are absent from available genome annotation data. Nevertheless, the number of processing products for any given locus remains very limited. The combined depth of our sequencing data is large enough to effectively exhaust the isoform diversity: the overwhelming majority of splice junctions that are observed at all are represented by multiple junction-spanning reads. We conclude that the human transcriptome produces virtually no background of RNAs that are processed at effectively random positions but is, under normal circumstances, confined to a well-defined set of splice variants.
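
The saturation argument can be illustrated with a minimal sketch. It assumes each transcriptome has already been reduced to a mapping from splice junction to the number of junction-spanning reads. The function names, the junction tuple layout (chromosome, donor, acceptor, strand), the toy counts, and the threshold of two reads are all hypothetical choices made for illustration and are not taken from the study's pipeline.

```python
from collections import Counter

def pool_junction_counts(samples):
    """Sum junction-spanning read counts over all samples (hypothetical helper)."""
    pooled = Counter()
    for sample in samples:
        for junction, reads in sample.items():
            pooled[junction] += reads
    return pooled

def multi_read_fraction(pooled, min_reads=2):
    """Fraction of observed junctions supported by at least min_reads pooled reads."""
    if not pooled:
        return 0.0
    supported = sum(1 for reads in pooled.values() if reads >= min_reads)
    return supported / len(pooled)

# Toy data (invented): each dict stands for one transcriptome.
samples = [
    {("chr1", 100, 200, "+"): 1},
    {("chr1", 100, 200, "+"): 1, ("chr1", 300, 400, "+"): 5},
    {("chr1", 500, 900, "+"): 1},
]

pooled = pool_junction_counts(samples)
fraction = multi_read_fraction(pooled)
print(f"{len(pooled)} junctions observed, {fraction:.2f} supported by >= 2 reads")
```

In this toy input, the first junction is seen only once in each of two samples and reaches multi-read support only after pooling, which mirrors how rare variants become recognizable in the superposition of many datasets. A fraction close to 1 in real pooled data would be consistent with the claim that few junctions arise from effectively random processing.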
