SOAPfusion: a robust and effective computational fusion discovery tool for RNA-seq reads

MOTIVATION: RNA-Seq provides a powerful approach for ab initio investigation of fusion transcripts, which represent critical translocation and post-transcriptional events that recode hereditary information. Most existing computational fusion detection tools struggle with accuracy and with the handling of reads that map to multiple locations.

RESULTS: We present SOAPfusion, a novel tool for fusion discovery with paired-end RNA-Seq reads. SOAPfusion is accurate and efficient, achieving high sensitivity (≥93%) and a low false-positive rate (≤1.36%) even when coverage is as low as 10×, which highlights its ability to detect fusions at low sequencing cost. On real data from Universal Human Reference RNA (UHRR) samples, SOAPfusion detected 7 novel fusion genes, more than other existing tools, and all of them have been validated by reverse transcription-polymerase chain reaction followed by Sanger sequencing. SOAPfusion thus proves to be an effective method for finding fusion transcripts, which should help accelerate pathological and therapeutic cancer studies.
