Detecting actively translated open reading frames in ribosome profiling data

RNA-sequencing protocols can quantify gene expression regulation from transcription to protein synthesis. Ribosome profiling (Ribo-seq) maps the positions of translating ribosomes over the entire transcriptome. We have developed RiboTaper (available at https://ohlerlab.mdc-berlin.de/software/), a rigorous statistical approach that identifies translated regions on the basis of the characteristic three-nucleotide periodicity of Ribo-seq data. We used RiboTaper with deep Ribo-seq data from HEK293 cells to derive an extensive map of translation that covered open reading frame (ORF) annotations for more than 11,000 protein-coding genes. We also found distinct ribosomal signatures for several hundred upstream ORFs and ORFs in annotated noncoding genes (ncORFs). Mass spectrometry data confirmed that RiboTaper achieved excellent coverage of the cellular proteome. Although dozens of novel peptide products were validated in this manner, few of the currently annotated long noncoding RNAs appeared to encode stable polypeptides. RiboTaper is a powerful method for comprehensive de novo identification of actively used ORFs from Ribo-seq data.
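
To illustrate what three-nucleotide periodicity means for a footprint profile, the following is a minimal sketch, not RiboTaper's actual procedure. It assumes a per-nucleotide list of P-site counts over a candidate ORF and measures how much of the profile's non-constant signal sits at a frequency of 1/3 cycles per nucleotide, using a plain discrete Fourier transform in place of a formal statistical test. The function name `periodicity_score` and the input layout are hypothetical.

```python
import cmath


def periodicity_score(psite_counts):
    """Fraction of non-DC signal power at 1/3 cycles per nucleotide (0 to 1).

    psite_counts: per-nucleotide P-site read counts over a candidate ORF
    (an assumed input format, chosen for illustration).
    """
    # Trim to a multiple of 3 so that frequency 1/3 is an exact DFT bin.
    n = len(psite_counts) - len(psite_counts) % 3
    if n < 6:
        return 0.0  # too short to say anything about periodicity

    # Remove the mean so overall coverage depth does not count as signal.
    mean = sum(psite_counts[:n]) / n
    x = [c - mean for c in psite_counts[:n]]

    energy = sum(v * v for v in x)
    if energy == 0:
        return 0.0  # perfectly flat profile: no periodic component

    # DFT coefficient at frequency 1/3 cycles per nucleotide.
    coeff = sum(v * cmath.exp(-2j * cmath.pi * i / 3) for i, v in enumerate(x))

    # By Parseval's theorem the total non-DC power over the full spectrum is
    # n * energy; for real input the 1/3 and 2/3 bins carry equal power.
    return 2 * abs(coeff) ** 2 / (n * energy)


if __name__ == "__main__":
    in_frame = [10, 1, 1] * 30   # reads concentrated on one codon position
    flat = [5] * 90              # uniform coverage, no frame preference
    print(periodicity_score(in_frame), periodicity_score(flat))
```

A score near 1 indicates that almost all variation in the profile follows the codon frame, while a score near 0 indicates none. A real analysis would additionally need a significance test against noise, which this sketch does not attempt.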
