Long Read Single-Molecule Real-Time Sequencing Elucidates Transcriptome-Wide Heterogeneity and Complexity in Esophageal Squamous Cells

Esophageal squamous cell carcinoma is a leading cause of cancer death. Mapping the transcriptional landscape, including isoforms, fusion transcripts, and long noncoding RNAs, has played a central role in understanding the regulatory mechanisms underlying malignant processes. However, conventional methods such as short-read RNA-seq struggle to define entire polyadenylated RNA molecules. Here, we combined single-molecule real-time sequencing with RNA-seq to generate high-quality long reads and to survey the transcriptional program in esophageal squamous cells. Compared with a recent annotation of the human transcriptome (Ensembl 38 release 91), the single-molecule real-time data identified many unannotated transcripts, novel isoforms of known genes, and an expanded repository of long intergenic noncoding RNAs (lincRNAs). By integrating an existing lincRNA catalog annotation, we defined 1,521 esophageal-cancer-specific lincRNAs from single-molecule real-time reads. Kyoto Encyclopedia of Genes and Genomes enrichment analysis indicated that these lincRNAs and their target genes are involved in a variety of cancer signaling pathways. Isoform usage analysis revealed shifted alternative splicing patterns, which could be recapitulated in clinical samples or were supported by previous studies. Using rigorous search criteria, we also detected multiple transcript fusions that are not documented in the current gene fusion database and are not readily identified from RNA-seq reads. Two novel fusion transcripts were verified by real-time PCR and Sanger sequencing. Overall, our long-read single-molecule sequencing substantially expands the current understanding of the full-length transcriptome in esophageal cells and provides novel insights into transcriptional diversity during oncogenic transformation.
