uPEPperoni: An online tool for upstream open reading frame location and analysis of transcript conservation

Background: Several small open reading frames located within the 5′ untranslated regions of mRNAs have recently been shown to be translated. In humans, about 50% of mRNAs contain at least one upstream open reading frame, representing a large resource of coding potential. We propose that some upstream open reading frames encode peptides that are functional and contribute to proteome complexity in humans and other organisms. We use the term uPEPs to describe peptides encoded by upstream open reading frames.

Results: We have developed an online tool, termed uPEPperoni, to facilitate the identification of putative bioactive peptides. uPEPperoni detects conserved upstream open reading frames in eukaryotic transcripts by comparing query nucleotide sequences against mRNA sequences within the NCBI RefSeq database. The algorithm first locates the main coding sequence and then searches for open reading frames 5′ to the main start codon, which are subsequently analysed for conservation. uPEPperoni also determines the substitution frequency for both the upstream open reading frames and the main coding sequence. In addition, the uPEPperoni tool produces sequence identity heatmaps, which allow rapid visual inspection of conserved regions in paired mRNAs.

Conclusions: uPEPperoni features user-nominated settings including nucleotide match/mismatch, gap penalties, Ka/Ks ratios and output mode. The heatmap output shows levels of identity between any two sequences and provides easy recognition of conserved regions. Furthermore, this web tool allows comparison of the evolutionary pressures acting on the upstream open reading frame against those acting on other regions of the mRNA. The heatmap web applet can also be used to visualise the degree of conservation in any pair of sequences. uPEPperoni is freely available on an interactive web server at http://upep-scmb.biosci.uq.edu.au.
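The search step described in the Results (locate the main coding sequence, then look for open reading frames 5′ to the main start codon) can be illustrated with a minimal Python sketch. This is not the uPEPperoni implementation. The function name `find_uorfs`, the `min_codons` parameter, the restriction to ATG start codons, and the decision to skip in-frame ATGs that read through into the main coding sequence (treated here as N-terminal extensions) are all assumptions made for illustration. The position of the main start codon is taken as an input, as it would be when read from a RefSeq annotation.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(mrna, cds_start, min_codons=2):
    """Return (start, end) pairs for ATG-initiated ORFs that begin 5' of the main CDS.

    mrna       -- transcript sequence (DNA or RNA alphabet)
    cds_start  -- 0-based index of the A in the main start codon
    min_codons -- minimum ORF length in codons, start included, stop excluded
                  (hypothetical threshold, not a uPEPperoni setting)
    Coordinates are 0-based and half-open; `end` includes the stop codon.
    """
    seq = mrna.upper().replace("U", "T")
    uorfs = []
    # Only consider ATGs that lie entirely upstream of the main start codon.
    for start in range(max(cds_start - 2, 0)):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] not in STOP_CODONS:
                continue
            end = pos + 3
            in_frame_with_cds = (cds_start - start) % 3 == 0
            # Assumption: an in-frame ATG that reads into the main CDS is an
            # N-terminal extension rather than a separate uORF, so skip it.
            if in_frame_with_cds and end > cds_start:
                break
            if (pos - start) // 3 >= min_codons:
                uorfs.append((start, end))
            break
    return uorfs


if __name__ == "__main__":
    # Toy transcript: 5' UTR containing one uORF, then a short main CDS at index 17.
    transcript = "GCCATGAAACCCTAGGCATGGCTGCTTAA"
    print(find_uorfs(transcript, cds_start=17))
```

Each reported interval could then be aligned against the corresponding region of a homologous transcript to assess conservation, which is the part of the analysis the web tool performs against RefSeq.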
