Towards Inference of a Biochemical Ontology From a Metabolic Database

To predict the metabolic fate of an arbitrary compound from its structure alone, it is useful to identify the substructural ‘functional groups’ that are biochemically reactive. These functional groups are the substructural elements that can be removed and replaced to transform one compound into another. Identifying functional groups is closely related to the problem of classifying compounds. This paper reviews the state of the art in biochemical databases and discusses how these sources may be applied to classifying compounds based solely on structure. We describe a biochemical informatics system for processing molecular data and show how 100 255 compositional (hasA) relationships are inferred between 835 abstractions and 9500 metabolites from the KEGG Ligand database. Specifically, we focus on the identification of amino acids and consider ways in which the inference of biochemical ontologies for metabolites could be improved in the future.
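As an illustration of what a compositional (hasA) relationship between an abstraction and a metabolite might look like in practice, the following minimal sketch treats both as labeled graphs of heavy atoms and records a hasA pair whenever the abstraction embeds in the metabolite. It is not the system described in this paper: the graph encoding, the function names (`has_substructure`, `infer_has_a`), and the toy molecules are all assumptions made for the example, and bond orders, hydrogens, and stereochemistry are ignored.

```python
def adjacency(n_atoms, bonds):
    """Build an undirected adjacency map from a list of (i, j) bonds."""
    adj = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def has_substructure(mol, pattern):
    """Return True if the pattern graph embeds in the molecule graph.

    Both arguments are (atoms, bonds) tuples: atoms is a list of element
    symbols, bonds is a list of index pairs. Simple backtracking search;
    hypothetical helper, adequate only for small toy graphs.
    """
    m_atoms, m_bonds = mol
    p_atoms, p_bonds = pattern
    m_adj = adjacency(len(m_atoms), m_bonds)
    p_adj = adjacency(len(p_atoms), p_bonds)

    def extend(mapping):
        k = len(mapping)  # next pattern atom to place
        if k == len(p_atoms):
            return True
        for cand in range(len(m_atoms)):
            if cand in mapping.values() or m_atoms[cand] != p_atoms[k]:
                continue
            # Every already-placed pattern neighbour of k must be bonded to cand.
            if all(mapping[q] in m_adj[cand] for q in p_adj[k] if q in mapping):
                mapping[k] = cand
                if extend(mapping):
                    return True
                del mapping[k]
        return False

    return extend({})


def infer_has_a(metabolites, abstractions):
    """Return the set of (metabolite, abstraction) hasA pairs."""
    return {
        (m_name, a_name)
        for m_name, mol in metabolites.items()
        for a_name, pattern in abstractions.items()
        if has_substructure(mol, pattern)
    }


# Toy abstractions (assumed encodings, heavy atoms only).
abstractions = {
    "amino_acid": (["N", "C", "C", "O", "O"], [(0, 1), (1, 2), (2, 3), (2, 4)]),
    "carboxyl": (["C", "O", "O"], [(0, 1), (0, 2)]),
}

# Toy metabolites (assumed encodings, heavy atoms only).
metabolites = {
    "glycine": (["N", "C", "C", "O", "O"], [(0, 1), (1, 2), (2, 3), (2, 4)]),
    "alanine": (["N", "C", "C", "C", "O", "O"],
                [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]),
    "acetate": (["C", "C", "O", "O"], [(0, 1), (1, 2), (1, 3)]),
}

has_a = infer_has_a(metabolites, abstractions)
for pair in sorted(has_a):
    print(pair)
```

In this toy setting, glycine and alanine each acquire hasA links to both abstractions, while acetate is linked only to the carboxyl abstraction. A production system would more plausibly rely on an established cheminformatics toolkit and on structure files from the database rather than hand-written graphs.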