Genenames.org: the HGNC and VGNC resources in 2021

Abstract: The HUGO Gene Nomenclature Committee (HGNC), based at EMBL's European Bioinformatics Institute (EMBL-EBI), assigns unique symbols and names to human genes. There are over 42,000 approved gene symbols in our current database, of which over 19,000 are for protein-coding genes. While we still update placeholder and problematic symbols, we are working towards stabilizing symbols where possible; over 2000 symbols for disease-associated genes are now marked as stable in our symbol reports. All of our data is available at the HGNC website, https://www.genenames.org. The Vertebrate Gene Nomenclature Committee (VGNC) was established to assign standardized nomenclature, in line with human nomenclature, for vertebrate species lacking their own nomenclature committee. In addition to the previous VGNC core species of chimpanzee, cow, horse and dog, we now name genes in cat, macaque and pig. Gene groups have been added to VGNC and currently include two complex families: olfactory receptors (ORs) and cytochrome P450s (CYPs). In collaboration with specialists, we have also named CYPs in species beyond our core set. All VGNC data is available at https://vertebrate.genenames.org/. This article provides an overview of our online data and resources, focusing on updates over the last two years.
