RefSeq curation and annotation of stop codon recoding in vertebrates

Abstract

Recoding of stop codons as amino acid-specifying codons is a co-translational event that enables C-terminal extension of a protein. Synthesis of selenoproteins requires recoding of internal UGA stop codons to specify selenocysteine (Sec), the 21st, non-standard amino acid, and plays a vital role in human health and disease. Separately, canonical stop codons can be recoded to specify standard amino acids in a process known as stop codon readthrough (SCR), producing extended protein isoforms with potential novel functions. Conventional computational tools cannot distinguish between the two functions of a stop codon, as a termination signal or as a sense codon, so they misannotate selenoprotein gene products and fail to predict SCR. Manual curation is therefore required to correctly represent recoded gene products and their functions. Our goal was to provide accurately curated and annotated datasets of selenoprotein and SCR transcript and protein records to serve as annotation standards and to promote basic and biomedical research. Gene annotations were curated in nine vertebrate model organisms and integrated into NCBI’s Reference Sequence (RefSeq) dataset, yielding 247 selenoprotein genes that encode 322 selenoproteins, and 93 genes exhibiting SCR that encode 94 SCR isoforms.
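The annotation problem described above can be illustrated with a minimal Python sketch. It is not the RefSeq pipeline or any tool named in this work; the function `translate`, its `recode` argument, and the toy mRNA are hypothetical and serve only to show why a conventional translator truncates a selenoprotein at an in-frame UGA, while a translator given curated recoding positions does not.

```python
# Standard genetic code (NCBI translation table 1), built in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna, recode=None):
    """Translate an mRNA coding sequence from its first codon.

    recode maps a zero-based codon index to the residue inserted at
    that stop codon (hypothetical interface, for illustration only).
    A stop codon that is not listed in recode terminates translation.
    """
    recode = recode or {}
    protein = []
    for index in range(len(mrna) // 3):
        residue = CODON_TABLE[mrna[3 * index:3 * index + 3]]
        if residue == "*":
            if index not in recode:
                break  # conventional behaviour: terminate here
            residue = recode[index]  # curated recoding event
        protein.append(residue)
    return "".join(protein)


# Toy transcript: AUG GCU UGA GGC UAA (in-frame UGA at codon index 2).
toy_mrna = "AUGGCUUGAGGCUAA"

conventional = translate(toy_mrna)            # stops at the UGA
recoded = translate(toy_mrna, recode={2: "U"})  # UGA read as Sec (U)
```

With the toy transcript, the conventional call stops at the internal UGA, whereas supplying the recoding position inserts Sec (one-letter code U) and continues to the terminal UAA. The same lookup could represent SCR by listing the canonical stop codon's index with a standard amino acid, provided the sequence passed in extends past that stop.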
