A computational platform for high-throughput analysis of RNA sequences and modifications by mass spectrometry

The field of epitranscriptomics is growing in importance, with chemical modification of RNA being associated with a wide variety of biological phenomena. A pivotal challenge in this area is the identification of modified RNA residues within their sequence contexts. Next-generation sequencing approaches are generally unable to capture modifications, although workarounds exist for some epitranscriptomic marks. Mass spectrometry (MS) offers a comprehensive solution by using approaches analogous to those of shotgun proteomics. However, software support for the analysis of RNA MS data is currently inadequate and does not allow high-throughput processing. In particular, existing software solutions lack the raw performance and statistical grounding to efficiently handle the large variety of modifications present on RNA. We present a free and open-source database search engine for RNA MS data, called NucleicAcidSearchEngine (NASE), that addresses these shortcomings. We demonstrate the capability of NASE to reliably identify a wide range of modified RNA sequences in three original datasets of varying complexity. In a human tRNA sample, we characterize over 20 different modification types simultaneously and find many cases of incomplete modification.
