ChiTaRS 2.1—an improved database of the chimeric transcripts and RNA-seq data with novel sense–antisense chimeric RNA transcripts

Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they may encode functional proteins and give rise to different disease phenotypes. The ChiTaRS 2.1 database of chimeric transcripts and RNA-seq data (http://chitars.bioinfo.cnio.es/) is the second version of the ChiTaRS database and includes improvements in content and functionality. Chimeras from eight organisms have been collated, including novel sense–antisense (SAS) chimeras resulting from the slippage of the sense and antisense intragenic regions. The new database version collects more than 29 000 chimeric transcripts and indicates the expression and tissue specificity for 333 entries confirmed by RNA-seq reads mapping to the chimeric junction sites. The user interface allows rapid and easy analysis of the evolutionary conservation of fusions, literature references and experimental data supporting fusions in different organisms. More than 1428 cancer breakpoints have been automatically collected from public databases and manually verified to identify their correct cross-references, genomic sequences and junction sites. As a result, the ChiTaRS 2.1 collection of chimeras from eight organisms and human cancer breakpoints extends our understanding of the evolution of chimeric transcripts in eukaryotes as well as their functional role in carcinogenic processes.
