SplicePie: a novel analytical approach for the detection of alternative, non-sequential and recursive splicing

Alternative splicing is a powerful mechanism that allows eukaryotic cells to produce a wide range of transcripts and protein isoforms from a relatively small number of genes. The mechanisms regulating (alternative) splicing and the paradigm of consecutive splicing have recently been challenged, especially for genes with a large number of introns. RNA-Seq, a powerful technology that uses deep sequencing to determine transcript structure and expression levels, is usually performed on mature mRNA and therefore does not allow detailed analysis of splicing progression. Sequencing pre-mRNA at different stages of splicing can potentially provide insight into mRNA maturation. Although the number of tools that analyze total and cytoplasmic RNA to elucidate transcriptome composition is growing rapidly, there are no tools specifically designed for the analysis of nuclear RNA, which contains a mixture of pre-mRNA and mature mRNA. We developed dedicated algorithms to investigate the splicing process. In this paper, we present a new classification of RNA-Seq reads based on three major stages of splicing: pre-, intermediate- and post-splicing. Applying this novel classification, we demonstrate that the order of splicing can be analyzed. Furthermore, we show that the multi-step nature of splicing can be investigated by assessing various types of recursive splicing events. We provide data that give biological insight into the order of splicing and show that non-sequential splicing of certain introns is reproducible and consistent across multiple cell lines. We validated our observations with independent experimental technologies and demonstrated the reliability of our method. The pipeline, named SplicePie, is freely available at: https://github.com/pulyakhina/splicing_analysis_pipeline. The example data can be found at: https://barmsijs.lumc.nl/HG/irina/example_data.tar.gz.
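To illustrate the idea of sorting reads into pre-, intermediate- and post-splicing classes, the following is a minimal Python sketch and not the SplicePie implementation. It rests on assumptions that the abstract does not spell out: a fragment is treated as "pre" if an aligned block crosses an exon-intron boundary, "post" if a gap between aligned blocks matches an annotated intron exactly, and "intermediate" if it shows both kinds of evidence. The `Fragment` type, the `classify_fragment` function and the half-open coordinate convention are hypothetical names and choices made for this example only.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Half-open [start, end) genomic coordinates (an assumption of this sketch).
Interval = Tuple[int, int]


@dataclass
class Fragment:
    """Hypothetical aligned fragment: the genomic blocks it covers, in order."""
    blocks: List[Interval]


def classify_fragment(fragment: Fragment, introns: List[Interval]) -> str:
    """Assign a fragment to an assumed splicing stage.

    Returns "pre", "intermediate", "post" or "unclassified".
    """
    spliced = False   # evidence that some intron has been removed
    retained = False  # evidence that some intron is still present

    for i_start, i_end in introns:
        # A gap between consecutive blocks that matches the intron exactly
        # is taken as evidence that this intron was spliced out.
        for (_, left_end), (right_start, _) in zip(fragment.blocks,
                                                   fragment.blocks[1:]):
            if left_end == i_start and right_start == i_end:
                spliced = True

        # A block that crosses either intron boundary is taken as evidence
        # that this intron is still present in the molecule.
        for b_start, b_end in fragment.blocks:
            if b_start < i_start < b_end or b_start < i_end < b_end:
                retained = True

    if spliced and retained:
        return "intermediate"
    if spliced:
        return "post"
    if retained:
        return "pre"
    return "unclassified"


# Toy gene with two introns (coordinates are invented for illustration).
introns = [(100, 200), (300, 400)]
print(classify_fragment(Fragment([(80, 120)]), introns))
print(classify_fragment(Fragment([(50, 100), (200, 250)]), introns))
print(classify_fragment(Fragment([(50, 100), (200, 250), (280, 320)]), introns))
```

In this toy example the first fragment crosses the boundary of the first intron, the second skips the first intron exactly, and the third skips the first intron while still extending into the second. Fragments that lie entirely within an exon or entirely within an intron carry no boundary information under these assumptions and are left unclassified.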
