Interactive Discriminative Mining of Chemical Fragments

Structural activity prediction is one of the most important tasks in chemoinformatics. The goal is to predict a property of interest given structural data on a set of small compounds or drugs. Ideally, systems that address this task should not only be accurate but also identify an interpretable discriminative structure that describes the structural elements most discriminant with respect to some target. This paper presents the application of Inductive Logic Programming (ILP) in an interactive software tool for discriminative mining of chemical fragments. In particular, we describe the coupling of an ILP system with molecular visualisation software that allows a chemist to graphically control the search for interesting patterns in chemical fragments. Furthermore, we show how structural information, such as rings and functional groups (for example carboxyls, amines, methyls, and esters), is integrated and exploited in the search.
