ChiTaRS: a database of human, mouse and fruit fly chimeric transcripts and RNA-sequencing data

Chimeric RNAs that comprise two or more different transcripts have been identified in many cancers and among the Expressed Sequence Tags (ESTs) isolated from different organisms; they might encode functional proteins and produce different disease phenotypes. The ChiTaRS database of Chimeric Transcripts and RNA-Sequencing data (http://chitars.bioinfo.cnio.es/) collects more than 16 000 chimeric RNAs from humans, mice and fruit flies, 233 chimeras confirmed by RNA-seq reads and ∼2000 cancer breakpoints. The database indicates the expression and tissue specificity of these chimeras, as confirmed by RNA-seq data, and it includes mass spectrometry results at the junctions of some human entries. Moreover, the database has advanced features to analyze junction consistency and to rank chimeras based on the evidence of repeated junction sites. Finally, ‘Junction Search’ screens the RNA-seq reads found at the chimeras’ junction sites to identify putative junctions in novel sequences entered by users. Thus, ChiTaRS is an extensive catalog of human, mouse and fruit fly chimeras that will extend our understanding of the evolution of chimeric transcripts in eukaryotes and can be advantageous in the analysis of human cancer breakpoints.
