RiboFlow, RiboR and RiboPy: an ecosystem for analyzing ribosome profiling data at read length resolution

Abstract

Summary: Ribosome occupancy measurements enable the estimation of protein abundance and the inference of translation mechanisms. Recent studies have revealed that sequence read lengths in ribosome profiling data are highly variable and carry critical information. Consequently, data analyses require the computation and storage of multiple metrics for a wide range of ribosome footprint lengths. We developed a software ecosystem that includes a new, efficient binary file format named ‘ribo’. Ribo files store all essential data grouped by ribosome footprint length. Users can assemble ribo files using our RiboFlow pipeline, which processes raw ribosome profiling sequencing data. RiboFlow is highly portable and customizable across a large number of computational environments, with built-in capabilities for parallelization. We also developed interfaces for writing and reading ribo files in the R (RiboR) and Python (RiboPy) environments. Using RiboR and RiboPy, users can efficiently access ribosome profiling quality control metrics, generate essential plots and carry out analyses. Altogether, these components create a software ecosystem for researchers to study translation through ribosome profiling.

Availability and implementation: For a quickstart, please see https://ribosomeprofiling.github.io. Source code, installation instructions and links to documentation are available on GitHub: https://github.com/ribosomeprofiling.

Supplementary information: Supplementary data are available at Bioinformatics online.
