Toxicology Analysis by Means of the JSM-method

MOTIVATION: We present a model that learns potential causes of toxicity from positive and negative examples and predicts toxicity for the dataset used in the Predictive Toxicology Challenge (PTC). The learning model assumes that the causes of toxicity can be given as substructures that are common to positive examples and are not substructures of negative examples. This assumption motivates the choice of a learning model, called the JSM-method, and of a language for representing chemical compounds, called the Fragmentary Code of Substructure Superposition (FCSS). With FCSS, a chemical compound is represented as a set of substructures that are 'biologically meaningful' from an expert's point of view.

RESULTS: The chosen learning model and representation language perform comparatively well on the PTC dataset: for three sex/species groups the predictions were ROC optimal, and for one group the prediction was nearly optimal. The predictions tend to be conservative (few predictions and almost no errors), which can be explained by specific features of the learning model.

AVAILABILITY: On request from finn@viniti.ru or serge@viniti.ru; see also http://ki-www2.intellektik.informatik.tu-darmstadt.de/~jsm/QDA.
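
The learning assumption stated under MOTIVATION can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each compound is already encoded as a set of descriptor strings (the names below are made up and are not real FCSS codes), that a hypothesis must be shared by at least two examples, and that hypotheses about non-toxicity are formed symmetrically from the negative examples. The function names `hypotheses` and `predict` are hypothetical.

```python
from itertools import combinations


def hypotheses(support, oppose):
    """Return the non-empty intersections of two or more `support` examples
    that are not contained in any `oppose` example."""
    support = [frozenset(s) for s in support]
    oppose = [frozenset(o) for o in oppose]
    found = set()
    # Start from all pairwise intersections of supporting examples.
    frontier = {a & b for a, b in combinations(support, 2)}
    # Close the collection under intersection with further examples.
    while frontier:
        found |= frontier
        frontier = {h & s for h in frontier for s in support} - found
    # Drop empty candidates and candidates that occur in an opposing example.
    return {h for h in found if h and not any(h <= o for o in oppose)}


def predict(compound, pos_hyps, neg_hyps):
    """Classify only when hypotheses of exactly one sign apply; otherwise abstain."""
    c = frozenset(compound)
    hit_pos = any(h <= c for h in pos_hyps)
    hit_neg = any(h <= c for h in neg_hyps)
    if hit_pos and not hit_neg:
        return "toxic"
    if hit_neg and not hit_pos:
        return "non-toxic"
    return None  # no hypothesis applies, or the hypotheses conflict


# Toy data with invented descriptor names (assumption, not FCSS codes).
positives = [
    {"nitro", "benzene", "alkyl"},
    {"nitro", "benzene", "hydroxyl"},
    {"aromatic_amine", "benzene"},
]
negatives = [
    {"benzene", "hydroxyl"},
    {"ester", "alkyl"},
    {"benzene", "ester"},
]

pos_hyps = hypotheses(positives, negatives)
neg_hyps = hypotheses(negatives, positives)

result_toxic = predict({"nitro", "benzene", "chloro"}, pos_hyps, neg_hyps)
result_nontoxic = predict({"ester", "hydroxyl"}, pos_hyps, neg_hyps)
result_conflict = predict({"nitro", "benzene", "ester"}, pos_hyps, neg_hyps)
result_unknown = predict({"benzene", "hydroxyl"}, pos_hyps, neg_hyps)
```

On this toy data the only surviving positive hypothesis is {nitro, benzene}, because {benzene} alone also occurs in a negative example. The last two test compounds receive no prediction, one because hypotheses of both signs apply and one because none applies. Abstaining in such cases is one way a model of this kind ends up making few predictions, in line with the conservative behaviour reported under RESULTS.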