Application of annotation-agnostic RNA sequencing data analysis tools for biomarker discovery in liquid biopsy

RNA sequencing analysis is an important field in the study of extracellular vesicles (EVs), as these particles contain a variety of RNA species that may have diagnostic, prognostic and predictive value. Many of the bioinformatics tools currently used to analyze EV cargo rely on third-party annotations. Recently, the analysis of unannotated expressed RNAs has attracted interest, since these may provide information complementary to traditional annotated biomarkers or may help refine the biological signatures used in machine learning by including unknown regions. Here we perform a comparative analysis of annotation-free and classical read-summarization tools on RNA sequencing data generated from EVs isolated from persons with amyotrophic lateral sclerosis (ALS) and from healthy donors. Differential expression analysis and digital-droplet PCR validation of unannotated RNAs confirmed their existence and demonstrated the usefulness of including such potential biomarkers in transcriptome analysis. We show that find-then-annotate methods perform similarly to standard tools for the analysis of known features and can also identify unannotated expressed RNAs, two of which were validated as overexpressed in ALS samples. These tools can therefore be used for stand-alone analysis or easily integrated into current workflows, and may be useful for re-analysis, as annotations can be integrated post hoc.
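The find-then-annotate idea described above can be illustrated with a minimal sketch. It is not the implementation of any tool compared in this study. It assumes a per-base coverage vector for a single chromosome and a toy annotation table; the function names, the coverage cutoff and the minimum region length are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # half-open interval [start, end)


def find_expressed_regions(
    coverage: List[float], cutoff: float, min_len: int = 1
) -> List[Interval]:
    """Step 1 (find): return contiguous runs where coverage >= cutoff.

    No annotation is consulted here, so unannotated loci are kept.
    """
    regions: List[Interval] = []
    start: Optional[int] = None
    for pos, depth in enumerate(coverage):
        if depth >= cutoff:
            if start is None:
                start = pos  # a new expressed run begins
        elif start is not None:
            if pos - start >= min_len:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the coverage vector.
    if start is not None and len(coverage) - start >= min_len:
        regions.append((start, len(coverage)))
    return regions


def annotate_regions(
    regions: List[Interval], annotation: Dict[str, Interval]
) -> List[Tuple[Interval, Optional[str]]]:
    """Step 2 (annotate): label each region with the first overlapping
    feature, or None when the region is unannotated."""
    labelled: List[Tuple[Interval, Optional[str]]] = []
    for r_start, r_end in regions:
        hit: Optional[str] = None
        for name, (a_start, a_end) in annotation.items():
            if r_start < a_end and a_start < r_end:  # intervals overlap
                hit = name
                break
        labelled.append(((r_start, r_end), hit))
    return labelled


# Toy data: hypothetical coverage and a single hypothetical feature.
coverage = [0, 0, 5, 8, 9, 4, 0, 0, 0, 6, 7, 7, 0]
annotation = {"toy_miRNA": (1, 5)}

regions = find_expressed_regions(coverage, cutoff=4, min_len=3)
labelled = annotate_regions(regions, annotation)
for region, name in labelled:
    print(region, name if name is not None else "unannotated")
```

Because annotation is applied only in the second step, the same list of expressed regions can be re-labelled later against an updated annotation without repeating the region-finding step, which is the post hoc integration mentioned above.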
