Analysis of small human proteins reveals the translation of upstream open reading frames of mRNAs.

To find novel short coding sequences among accumulated full-length cDNA sequences, we performed a proteomic analysis of small proteins expressed in human leukemia K562 cells, using high-resolution nanoflow liquid chromatography coupled with electrospray ionization tandem mass spectrometry. The analysis identified 54 proteins of no more than 100 amino acids, four of which were novel. The novel short coding sequences were all located upstream of the longest open reading frame (ORF) of the corresponding cDNA. These findings indicate that short ORFs are translated in vivo whether or not a longer coding region lies downstream in the same mRNA. This investigation provides the first direct evidence of translation of upstream ORFs in human cells, which could substantially change the current outline of the human proteome.
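The positional criterion used above, a short ORF lying upstream of the longest ORF of a cDNA, can be illustrated with a minimal sketch. This is not the authors' pipeline (the study identified the peptides by mass spectrometry). It assumes forward-strand ORFs that begin at the first ATG after a stop codon in a given frame and end at the next in-frame stop, and the function names `find_orfs` and `upstream_short_orfs` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq):
    """List ATG-to-stop ORFs in the three forward reading frames.

    Each ORF is (start, end, n_codons): start is 0-based, end is exclusive
    and includes the stop codon, n_codons counts the ATG but not the stop.
    Assumption: only the first ATG of each stop-delimited stretch is used.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3, (i - start) // 3))
                start = None
    return orfs


def upstream_short_orfs(seq, max_aa=100):
    """Return ORFs of at most max_aa codons that start upstream of the longest ORF."""
    orfs = find_orfs(seq)
    if not orfs:
        return []
    longest = max(orfs, key=lambda orf: orf[2])
    return [
        orf for orf in orfs
        if orf is not longest and orf[0] < longest[0] and orf[2] <= max_aa
    ]
```

The default `max_aa=100` mirrors the 100-amino-acid cutoff stated in the abstract. A real analysis would also need to handle alternative start codons and incomplete cDNA ends, which this sketch ignores.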
