Automated compound classification using a chemical ontology

Background: Classifying chemical compounds into compound classes using structure-derived descriptors is a well-established way to aid the evaluation and abstraction of compound properties in chemical compound databases. MeSH and, more recently, ChEBI are examples of chemical ontologies that provide a hierarchical classification of compounds into general compound classes of biological interest, based on their structural features as well as their property or use features. In these ontologies, compounds have been assigned manually to their respective classes. However, given the ever-increasing possibilities to extract new compounds from text documents using name-to-structure tools, and the large number of compounds deposited in databases, automated and comprehensive chemical classification methods are needed to avoid the error-prone and time-consuming manual classification of compounds.

Results: In the present work we implement principles and methods to construct a chemical ontology of classes that is intended to support automated, high-quality compound classification in chemical databases or text documents. While SMARTS expressions have already been used to define chemical structure class concepts, here we extend the expressive power of such class definitions by expanding their structure-based reasoning logic. To achieve the required precision and granularity of chemical class definitions, sets of SMARTS class definitions are connected by OR and NOT logical operators. In addition, AND logic has been implemented to allow the concomitant use of flexible atom lists and stereochemistry definitions. The resulting chemical ontology is a multi-hierarchical taxonomy of concept nodes connected by directed, transitive relationships.

Conclusions: We propose a rule-based definition of chemical classes that allows chemical compound classes to be defined more precisely than before. The proposed structure-based reasoning logic allows chemistry expert knowledge to be translated into a computer-interpretable form, preventing erroneous compound assignments and allowing automatic compound classification. Compounds in databases, compound structure files or text documents can be assigned automatically to their related ontology classes through integration with a chemical structure search engine. As an application example, the annotation of chemical structure files with a prototypic ontology is demonstrated.
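
The OR/NOT combination of SMARTS definitions and the transitive class hierarchy described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `ClassRule`, `classify`, `table_matcher`, the example classes and the SMARTS strings are all hypothetical, chosen only for illustration. The substructure matcher is passed in as a function, where a real system would call a chemical structure search engine; here a lookup table of precomputed hits stands in for it, so the SMARTS strings are used as opaque keys and are never actually evaluated. The AND logic for atom lists and stereochemistry is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# A matcher answers "does this compound contain this SMARTS pattern?".
# Assumption: a real system would delegate this to a structure search engine.
Matcher = Callable[[str, str], bool]


@dataclass
class ClassRule:
    """Hypothetical class definition: any of `any_of` (OR), none of `none_of` (NOT)."""
    any_of: List[str]
    none_of: List[str] = field(default_factory=list)

    def matches(self, compound: str, match: Matcher) -> bool:
        included = any(match(compound, s) for s in self.any_of)
        excluded = any(match(compound, s) for s in self.none_of)
        return included and not excluded


def classify(compound: str, rules: Dict[str, ClassRule],
             parents: Dict[str, List[str]], match: Matcher) -> Set[str]:
    """Return directly matched classes plus all ancestors (transitive is-a)."""
    stack = [name for name, rule in rules.items() if rule.matches(compound, match)]
    assigned: Set[str] = set()
    while stack:
        name = stack.pop()
        if name not in assigned:
            assigned.add(name)
            stack.extend(parents.get(name, []))  # a class may have several parents
    return assigned


# Illustrative SMARTS strings; treated as opaque keys by the stand-in matcher.
PRIMARY_AMINE = "[NX3;H2][#6]"
SECONDARY_AMINE = "[NX3;H1]([#6])[#6]"
AMIDE = "[NX3][CX3]=[OX1]"
ESTER = "[CX3](=[OX1])[OX2][#6]"

RULES = {
    # OR over two amine patterns, NOT amide: an amide nitrogen is not an amine.
    "amine": ClassRule(any_of=[PRIMARY_AMINE, SECONDARY_AMINE], none_of=[AMIDE]),
    "amide": ClassRule(any_of=[AMIDE]),
    "ester": ClassRule(any_of=[ESTER]),
}

# Directed is-a edges; "amide" has two parents (multi-hierarchy).
PARENTS = {
    "amine": ["organonitrogen compound"],
    "amide": ["carboxylic acid derivative", "organonitrogen compound"],
    "ester": ["carboxylic acid derivative"],
    "carboxylic acid derivative": ["organic compound"],
    "organonitrogen compound": ["organic compound"],
}

# Stand-in for a structure search engine: precomputed pattern hits per compound.
PRECOMPUTED_HITS = {
    "ethylamine": {PRIMARY_AMINE},
    "acetamide": {PRIMARY_AMINE, AMIDE},
    "ethyl acetate": {ESTER},
}


def table_matcher(compound: str, smarts: str) -> bool:
    return smarts in PRECOMPUTED_HITS.get(compound, set())


if __name__ == "__main__":
    for compound in PRECOMPUTED_HITS:
        print(compound, sorted(classify(compound, RULES, PARENTS, table_matcher)))
```

In this sketch the NOT clause keeps acetamide out of the "amine" class even though its nitrogen matches the primary amine pattern, which is the kind of erroneous assignment the rule logic is meant to prevent. The exclusion is applied to the whole molecule, which is a simplification: a compound containing both an amine and an amide group would also be excluded.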
