"Good Annotation Practice" for Chemical Data in Biology

A structural diagram, in the form of a two-dimensional (2-D) sketch, remains the most effective portrait of a "small molecule" or chemical reaction. However, structural diagrams, like any other core data, cannot be used in speech (and should not be used in free text). "Good annotation practice" for biological databases is to refer to the molecule of interest using either consistent and widely recognised terminology or unique identifiers from a dedicated database. Ideally, scientists should use terminology that is both pronounceable and meaningful. Thus, a viable solution for a bioinformatician is to use a definitive controlled vocabulary of biochemical compounds and reactions that contains both systematic and common names. In addition, chemical ontologies provide a means of placing entities of interest into wider chemical, biological or medical contexts. We present some challenges and achievements in the standardisation of chemical language in biological databases, with emphasis on three aspects of annotation: (1) good drawing practice: how to draw unambiguous 2-D diagrams; (2) good naming practice: how to give the most appropriate names; and (3) good ontology practice: how to link the entity of interest to other entities by defined logical relationships.
