A database of locus-specific databases

To the Editor: Complete and accurate information on genetic mutations and their effects on patients is essential for proper genetic healthcare. This realization led a group of prominent human geneticists to propose a federation of Locus-Specific Database (LSDB) curators as the best means of collecting and curating accurate lists of mutations1. This proposal led to the formation of the Mutation Database Initiative (MDI)2 under the auspices of the Human Genome Organisation (HUGO); MDI later became a society now known as the Human Genome Variation Society (HGVS)3,4. Key activities aimed at collecting mutations were initiated: encouraging collection of information (the first step in creating a database) by inviting reviews of mutations in genes for the journal Human Mutation, creating guidelines for nomenclature of mutations5–8, initiating quality control of LSDB content9,10 and specifying the minimum content of LSDBs11,12. Later, the content of 100 representative LSDBs was published, leading to further recommendations for content13 and a recommended form, published by members of the initiative, for submitting mutations to LSDBs (ref. 14 and http://www.hgvs.org/entry.html). As a result of this activity, the number of LSDBs grew to 83 by 2002. More recently, customized software has been made available to assist new curators (for example, LOVD15 and UMD16).

Documentation of LSDBs as an aid to research and clinical care began, and a listing was posted on the HUGO/MDI website (now the HGVS website; http://www.hgvs.org) in early 1998 containing 209 databases; this listing was later published17. The listing has grown over the years and has become increasingly difficult to maintain; thus, a new database of LSDBs was created as a relational database on a MySQL platform (http://www.mysql.com) to make curation of these sites easier. In January 2006, a program was initiated to update the listing and add unlisted LSDBs. Dead links were investigated, and curators were contacted to create new links. This process led to the permanent deletion of four LSDBs and the addition of 176 more LSDBs from various sources, 75 of which were from the Retina International Scientific Newsletter Mutation Databases (see below) and 72 from the IMT Bioinformatic Groups Mutation Databases (see below). The latter two sets could perhaps be called aggregated databases. In the case of Retina International, the databases appear to be derived directly from the literature.

The latest listing (9 March 2007) includes 672 LSDBs and is likely to grow (http://www.hgvs.org/dblist/glsdb.html). This number represents 32% of genes in which at least one mutation has been reported (according to the Human Gene Mutation Database (HGMD); 2,056 genes, as of 9 March 2007). Beyond the information displayed on the HGVS website, the LSDB database includes gene-specific links to outside databases to aid in curation (EMBL’s Ensembl (http://www.ensembl.org), the HUGO Gene Nomenclature Committee’s gene nomenclature database (http://www.gene.ucl.ac.uk/nomenclature) and NCBI’s Entrez Gene (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene)).

LSDBs are important because (i) curation by experts on the genes under consideration is hugely superior to that which can be given by experts on databases that collect mutations on all genes, such as OMIM18 and HGMD19, and (ii) curators are generally able to collect unpublished mutations from their laboratories, collaborators and abstracts, and unpublished mutations can represent as much as 50% of the total20. These LSDBs are extremely useful, indeed vital, to research and proper healthcare, but there is usually little or no funding for their curation. As a result, a number of databases have not been updated recently or have even been withdrawn. A mechanism clearly needs to be found to prevent this loss of data and data collection.
If there are any LSDBs in existence that do not appear on the list, the authors would be pleased to hear of them.
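
The letter does not describe the internal layout of the relational database of LSDBs. As a rough illustration only of how such a listing might be organized so that dead links and per-gene coverage can be queried, the following sketch uses Python's built-in sqlite3 module in place of MySQL. The table names, column names, gene symbols and URLs are all hypothetical placeholders and are not taken from the HGVS database.

```python
import sqlite3

# Hypothetical schema: names are illustrative assumptions, not the
# actual layout of the HGVS database of LSDBs.
SCHEMA = """
CREATE TABLE gene (
    symbol TEXT PRIMARY KEY               -- gene symbol
);
CREATE TABLE lsdb (
    id INTEGER PRIMARY KEY,
    gene_symbol TEXT NOT NULL REFERENCES gene(symbol),
    url TEXT NOT NULL,
    link_ok INTEGER NOT NULL DEFAULT 1    -- 0 marks a dead link to follow up
);
"""


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database holding a few placeholder records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO gene VALUES (?)",
        [("GENE_A",), ("GENE_B",), ("GENE_C",)],
    )
    conn.executemany(
        "INSERT INTO lsdb (gene_symbol, url, link_ok) VALUES (?, ?, ?)",
        [
            ("GENE_A", "http://example.org/a", 1),
            ("GENE_A", "http://example.org/a2", 0),
            ("GENE_B", "http://example.org/b", 1),
        ],
    )
    return conn


def dead_links(conn: sqlite3.Connection) -> list:
    """Return (gene, url) pairs for listings flagged as dead links."""
    rows = conn.execute(
        "SELECT gene_symbol, url FROM lsdb WHERE link_ok = 0 ORDER BY id"
    )
    return rows.fetchall()


def genes_with_lsdb(conn: sqlite3.Connection) -> int:
    """Count distinct genes that have at least one listed LSDB."""
    (count,) = conn.execute(
        "SELECT COUNT(DISTINCT gene_symbol) FROM lsdb"
    ).fetchone()
    return count
```

Keeping genes and LSDB listings in separate tables allows one gene to have several databases, and a simple flag column is one possible way to track links that need curator follow-up.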

[1] N. Dracopoli, et al. Current protocols in human genetics, 1994.

[2] R. G. H. Cotton, et al. The HUGO Mutation Database Initiative, 1998, The Pharmacogenomics Journal.

[3] H. Lehväslaiho, et al. Guidelines and recommendations for content, structure, and deployment of mutation databases, 1999, Human Mutation.

[4] S. Antonarakis, et al. Mutation nomenclature extensions and suggestions to describe complex mutations: A discussion, 2000.

[5] J. D. den Dunnen, et al. Standardizing mutation nomenclature: Why bother?, 2003, Human Mutation.

[6] C. R. Scriver, et al. Guidelines and recommendations for content, structure, and deployment of mutation databases: II. Journey in progress, 2000, Human Mutation.

[7] P. Stenson, et al. Human Gene Mutation Database (HGMD®): 2003 update, 2003, Human Mutation.

[8] S. Antonarakis. Recommendations for a nomenclature system for human gene mutations, 1998.

[9] Thierry Soussi, et al. UMD (Universal Mutation Database): 2005 update, 2005, Human Mutation.

[10] C. Scriver, et al. The Metabolic and Molecular Bases of Inherited Disease, 8th Edition 2001, 2001, Journal of Inherited Metabolic Disease.

[11] R. Cotton, et al. Progress of the HUGO Mutation Database Initiative: A brief introduction to the Human Mutation MDI Special Issue, 2000, Human Mutation.

[12] Ourania Horaitis, et al. Time for a unified system of mutation description and reporting: a review of locus-specific mutation databases, 2002, Genome Research.

[13] L. Tsui, et al. A suggested nomenclature for designating mutations, 1993, Human Mutation.

[14] C. R. Scriver, et al. Proof of “disease causing” mutation, 1998, Human Mutation.

[15] Alan F. Scott, et al. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders, 2004, Nucleic Acids Res.

[16] R. Cotton, et al. Quality control in the discovery, reporting, and recording of genomic variation, 2000, Human Mutation.

[17] R. G. H. Cotton, et al. The HUGO Mutation Database Initiative, 1998, Science.

[18] David Neil Cooper, et al. Nature encyclopedia of the human genome, 2003.

[19] I. Fokkema, et al. LOVD: Easy creation of a locus-specific sequence variation database using an “LSDB-in-a-box” approach, 2005, Human Mutation.