Detection of alternative isoforms of gene fusions from long-read RNA-seq with FLAIR-fusion

Gene fusions are important cancer drivers and drug targets, but they are difficult to identify reliably with short-read RNA sequencing. Long reads are more likely to span a fusion breakpoint and provide more sequence context around it, which allows more reliable identification of gene fusions and detection of alternative splicing within them. Notably, alternative splicing of fusions has been shown to be a mechanism of drug resistance and of altered oncogenicity. Here, we present FLAIR-fusion, a computational tool that identifies gene fusions and their isoforms from long-read RNA-sequencing data. FLAIR-fusion detects fusions and their isoforms with high precision and recall, even from error-prone reads. We also investigated different library preparation methods and found that direct-cDNA has a higher incidence of artifactual chimeras than the direct-RNA and PCR-cDNA methods. FLAIR-fusion filters these technical artifacts for all three library preparation methods and consistently identifies known fusions and their isoforms across cell lines. We ran FLAIR-fusion on amplicon sequencing from multiple tumor samples and cell lines and detected alternative splicing in the previously validated fusion GUCYA2-PIWIL4, which shows that long-read sequencing can detect novel splicing events from cancer gene panels. We also detected fusion isoforms from long-read sequencing of chronic lymphocytic leukemias carrying the splicing factor mutation SF3B1 K700E, and found that up to 10% of gene fusions had more than one unique isoform. Finally, we compared long-read and short-read fusion detection tools on the same samples and found greater consensus among the long-read tools. Our results demonstrate that gene fusion isoforms can be effectively detected from long-read RNA sequencing and are important for characterizing the full complexity of cancer transcriptomes.
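To make the idea of "fusion isoforms" concrete, the following is a minimal sketch of how reads spanning two genes might be grouped by gene pair and then by splice-junction chain. It is an illustration only, not FLAIR-fusion's actual algorithm: the input tuple layout, the function names `fusion_isoform_counts` and `fraction_multi_isoform`, the `min_reads` threshold, and all coordinates are hypothetical.

```python
from collections import defaultdict


def fusion_isoform_counts(reads, min_reads=2):
    """Group chimeric reads into fusions and count reads per isoform.

    `reads` is an iterable of (read_id, gene_5p, gene_3p, junction_chain),
    where junction_chain is a list of (donor, acceptor) coordinates.
    This input layout is an assumption made for the example.
    """
    by_fusion = defaultdict(lambda: defaultdict(set))
    for read_id, gene_5p, gene_3p, chain in reads:
        if gene_5p == gene_3p:
            continue  # both ends in the same gene: not a fusion candidate
        by_fusion[(gene_5p, gene_3p)][tuple(chain)].add(read_id)

    result = {}
    for fusion, isoforms in by_fusion.items():
        # Keep only isoforms with enough supporting reads; a crude
        # stand-in for filtering out artifactual chimeras.
        supported = {
            chain: len(ids)
            for chain, ids in isoforms.items()
            if len(ids) >= min_reads
        }
        if supported:
            result[fusion] = supported
    return result


def fraction_multi_isoform(result):
    """Fraction of retained fusions that have more than one isoform."""
    if not result:
        return 0.0
    multi = sum(1 for isoforms in result.values() if len(isoforms) > 1)
    return multi / len(result)


# Toy input with made-up coordinates and read names.
reads = [
    ("r1", "BCR", "ABL1", [(100, 200), (300, 400)]),
    ("r2", "BCR", "ABL1", [(100, 200), (300, 400)]),
    ("r3", "BCR", "ABL1", [(100, 250), (300, 400)]),
    ("r4", "BCR", "ABL1", [(100, 250), (300, 400)]),
    ("r5", "GENEA", "GENEB", [(1, 2)]),    # single read: dropped
    ("r6", "BCR", "BCR", [(100, 200)]),    # same gene: skipped
]
result = fusion_isoform_counts(reads)
multi_fraction = fraction_multi_isoform(result)
```

In this toy input, the BCR-ABL1 fusion is retained with two isoforms that differ at one splice junction, while the singly supported GENEA-GENEB chimera is discarded. A real tool would need far more careful artifact filtering than a read-count threshold.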
