A Survey of mRNA Sequences with a Non-AUG Start Codon in RefSeq Database

Abstract: Alternative initiation of translation is an important mechanism by which multiple proteins are synthesized from a single mRNA. Translation initiation at a non-AUG codon has been reported in several experimental studies. We analyzed all mRNA sequences in the RefSeq database and found that the coding regions of about 0.1% of them begin with a non-AUG codon (non-AUG mRNAs). A major fraction of these non-AUG mRNAs is predicted from genomic sequences. More than 100 non-AUG sequences are highly curated, and 52 of them are explicitly annotated as using alternate start codons for translation initiation. Analysis of these sequences reveals that the majority of the protein products contain domains that bind DNA/RNA, act as kinases or growth factors, or are involved in immune response or cell proliferation. Thus, proteins translated from non-canonical start codons appear to play regulatory and/or signaling roles. The sequence context of the non-AUG start codons shows that a purine at the −3 position and/or a G at the +4 position is strongly conserved, and the corresponding genes give rise to alternate transcripts and/or multiple isoforms. We have also developed a database, "nonAUG" (http://bioinfo.iitk.ac.in), that collects all mRNA sequences whose coding regions start with a non-AUG codon. The nonAUG database will be continuously updated and is freely available to the scientific community.
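The two checks described in the abstract (whether a coding region begins with a non-AUG codon, and whether the start codon sits in a context with a purine at −3 and a G at +4) can be illustrated with a short sketch. This is not the authors' pipeline: the function name `start_context`, the `StartContext` record, and the example sequence are hypothetical, and the code assumes the annotated CDS start is already known as a 0-based index into the mRNA.

```python
from typing import NamedTuple


class StartContext(NamedTuple):
    """Hypothetical record summarizing the annotated start codon and its context."""
    start_codon: str
    is_non_aug: bool
    purine_at_minus3: bool
    g_at_plus4: bool


def start_context(mrna: str, cds_start: int) -> StartContext:
    """Inspect the start codon at `cds_start` (0-based index of its first base).

    Positions follow the usual convention: the first base of the start
    codon is +1, the base immediately upstream is -1 (there is no 0).
    """
    # Accept DNA-style records (T) as well as RNA (U).
    seq = mrna.upper().replace("T", "U")
    codon = seq[cds_start:cds_start + 3]
    if cds_start < 0 or len(codon) != 3:
        raise ValueError("cds_start does not point at a complete codon")

    # -3 is three bases upstream of the codon's first base.
    minus3 = seq[cds_start - 3] if cds_start >= 3 else ""
    # +4 is the base immediately after the three-base start codon.
    plus4 = seq[cds_start + 3] if cds_start + 3 < len(seq) else ""

    return StartContext(
        start_codon=codon,
        is_non_aug=(codon != "AUG"),
        purine_at_minus3=minus3 in ("A", "G"),
        g_at_plus4=(plus4 == "G"),
    )


# Made-up example: a CUG start codon at index 9 with A at -3 and G at +4.
example = start_context("GCCGCCACCCTGGCG", 9)
print(example)
```

Applied across a set of records with annotated CDS coordinates, counting the entries where `is_non_aug` is true would give the kind of fraction reported above, and the two context flags would give the conservation tallies. How partial or incomplete coding regions are handled is left open here.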
