RED-ML: a novel, effective RNA editing detection method based on machine learning

Abstract With the advancement of second-generation sequencing techniques, our ability to detect and quantify RNA editing on a global scale has vastly improved. As a result, RNA editing is now being studied under a growing number of biological conditions so that its biochemical mechanisms and functional roles can be better understood. However, a major barrier that prevents RNA editing detection from becoming a routine RNA-seq analysis, like gene expression and splicing analysis, is the lack of user-friendly and effective computational tools. Based on years of experience analyzing RNA editing in diverse RNA-seq datasets, we have developed a software tool, RED-ML: RNA Editing Detection based on Machine learning (pronounced “red ML”). The input to RED-ML can be as simple as a single BAM file, and it can also take advantage of matched genomic variant information when available. The output contains not only the detected RNA editing sites but also a confidence score to facilitate downstream filtering. We have carefully designed validation experiments and performed extensive comparison and analysis to show the efficiency and effectiveness of RED-ML under different conditions; it can accurately detect novel RNA editing sites without relying on curated RNA editing databases. We have also made this tool freely available via GitHub. We have developed a highly accurate, fast and general-purpose tool for RNA editing detection using RNA-seq data. With the availability of RED-ML, it is now possible to conveniently make RNA editing detection a routine part of RNA-seq analysis. We believe this can greatly benefit the RNA editing research community and help accelerate our understanding of this intriguing posttranscriptional modification process.
