A Stacking-Based Approach to Identify Translated Upstream Open Reading Frames in Arabidopsis thaliana

Upstream open reading frames (uORFs) are open reading frames located within the 5’ UTR of an mRNA. Translated uORFs are believed to reduce the translational efficiency of the main coding region and to play an important role in gene regulation. However, only a few uORFs have been experimentally characterized. In this paper, we combine ribosome footprinting with a stacking-based classification approach to identify translated uORFs in Arabidopsis thaliana. Our approach yielded a set of 5360 potentially translated uORFs in 2051 genes. GO terms enriched among uORF-containing genes include gene regulation, signal transduction and metabolic pathways. The identified uORFs occur at a higher frequency in multi-isoform genes, and many are affected by alternative transcription start sites or alternative splicing events.
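To make the definition of a uORF concrete, the following is a minimal sketch of how candidate uORFs could be enumerated in a 5’ UTR sequence. It is not the paper's pipeline: the function name `find_uorfs`, the `min_codons` parameter, the restriction to ATG start codons, and the requirement that the in-frame stop codon also lies inside the UTR are all assumptions made for illustration. Candidates found this way would still need evidence of translation, such as ribosome footprints, before being classified as translated.

```python
# Hypothetical helper, not taken from the paper: enumerate candidate uORFs
# that start with ATG and end with an in-frame stop codon inside the 5' UTR.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr5, min_codons=2):
    """Return (start, end) pairs for ATG...stop ORFs fully inside utr5.

    start is the 0-based index of the A in ATG; end is the exclusive index
    just past the stop codon. min_codons counts the start and stop codons.
    """
    seq = utr5.upper().replace("U", "T")  # accept RNA or DNA input
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this start codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                if (end - start) // 3 >= min_codons:
                    uorfs.append((start, end))
                break  # the first in-frame stop terminates this ORF
    return uorfs


if __name__ == "__main__":
    print(find_uorfs("CCATGAAATAGGG"))
```

Start codons without an in-frame stop inside the UTR are skipped here; uORFs that overlap the main coding region would need the downstream sequence as well.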
