A Regression-Based Analysis of Ribosome-Profiling Data Reveals a Conserved Complexity to Mammalian Translation.

A fundamental goal of genomics is to identify the complete set of expressed proteins. Automated annotation strategies rely on assumptions about protein-coding sequences (CDSs), e.g., that they are conserved, do not overlap, and exceed a minimum length. However, an increasing number of newly discovered proteins violate these rules. Here we present an experimental and analytical framework, based on ribosome profiling and linear regression, for systematic identification and quantification of translation. Application of this approach to lipopolysaccharide-stimulated mouse dendritic cells and HCMV-infected human fibroblasts identifies thousands of novel CDSs, including micropeptides and variants of known proteins, that bear the hallmarks of canonical translation and exhibit translation levels and dynamics comparable to those of annotated CDSs. Remarkably, many translation events are identified in both mouse and human cells even when the peptide sequence is not conserved. Our work thus reveals an unexpected complexity to mammalian translation suited to provide both conserved regulatory and protein-based functions.
