Identification of genome-wide non-canonical spliced regions and analysis of biological functions for spliced sequences using Read-Split-Fly

Background: It is generally thought that most canonical and non-canonical splicing events involving the U2- and U12-type spliceosomes occur within nuclear pre-mRNAs. However, whether at least some U12-type splicing occurs in the cytoplasm remains unclear. In recent years, next-generation sequencing technologies have revolutionized the field. The “Read-Split-Walk” (RSW) and “Read-Split-Run” (RSR) methods were developed to identify genome-wide non-canonical spliced regions, including special events occurring in the cytoplasm. As large amounts of genome and transcriptome data, such as those from the Encyclopedia of DNA Elements (ENCODE) project, have been generated, we have developed a newer, more memory-efficient version of the algorithm, “Read-Split-Fly” (RSF), which can detect non-canonical spliced regions with higher sensitivity and improved speed. The RSF algorithm also outputs the spliced sequences for downstream biological function analysis.

Results: We used open-access ENCODE RNA-Seq data to search spliced intron sequences against a U12-type intron sequence database and examine whether some events carry potential signatures of U12-type splicing. The check was performed by searching spliced sequences against 5’ss and 3’ss sequences from U12DB, a well-known database of orthologous U12-type spliceosomal introns. Preliminary results from 70 ENCODE samples indicated that 5’ss with a U12-type signature are more frequent than those with a U2-type signature and are prevalent in the non-canonical junctions reported by RSF. The selected spliced sequences were further studied using miRBase to elucidate their functionality. Preliminary results from the 70 ENCODE samples show that several miRNAs are prevalent in the samples studied. Two of these, hsa-miR-1273 and hsa-miR-548, are associated with many diseases and cancers, as suggested in the literature.

Conclusions: Our RSF pipeline is able to detect many possible junctions (especially those with a high RPKM) with very high overall accuracy and relatively high accuracy for novel junctions. We have incorporated useful features into the pipeline, such as handling variable-length read data and searching spliced sequences for splicing signatures and miRNA events. We suggest that RSF, a tool for identifying novel splicing events, can be applied to study a range of diseases across biological systems under different experimental conditions.
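
The 5’ss signature check described in the Results can be illustrated with a minimal sketch. It is not the RSF implementation and does not query U12DB: it assumes simplified textbook consensus motifs for the first intronic bases (RTATCCTT for U12-type, GTRAGT for U2-type) as a stand-in for the database search, and the names `classify_5ss` and `tally_5ss` are hypothetical.

```python
import re
from collections import Counter

# Assumed, simplified consensus motifs at the start of the intron.
# They stand in for the U12DB-derived 5'ss sequences used in the paper.
U12_5SS = re.compile(r"^[AG]TATCCTT")  # U12-type: RTATCCTT
U2_5SS = re.compile(r"^GT[AG]AGT")     # U2-type:  GTRAGT


def classify_5ss(spliced_seq):
    """Label a spliced (intron) sequence by the motif at its 5' end."""
    # Normalise case and accept RNA input by mapping U to T.
    seq = spliced_seq.upper().replace("U", "T")
    if U12_5SS.match(seq):
        return "U12-type"
    if U2_5SS.match(seq):
        return "U2-type"
    return "other"


def tally_5ss(spliced_seqs):
    """Count how many sequences fall into each 5'ss class."""
    return Counter(classify_5ss(s) for s in spliced_seqs)


if __name__ == "__main__":
    # Made-up example sequences, for illustration only.
    examples = ["GTATCCTTTGACA", "ATATCCTTAACG", "GTAAGTCCTG", "CCGGTTAA"]
    print(tally_5ss(examples))
```

A motif match at this level only flags a candidate signature; comparing against curated 5’ss and 3’ss sequences, as the abstract describes, is the stricter test.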
