Efficient learning and evaluation of complex concepts in inductive logic programming

Inductive Logic Programming (ILP) is a subfield of Machine Learning with foundations in logic programming. In ILP, logic programming, a subset of first-order logic, is used as a uniform representation language for the problem specification and induced theories. ILP has been successfully applied to many real-world problems, especially in the biological domain (e.g. drug design, protein structure prediction), where relational information is of particular importance.

The expressiveness of logic programs makes the learning task flexible to specify and the induced theories easy to understand. However, this flexibility comes at a high computational cost, which constrains the applicability of ILP systems. Constructing and evaluating complex concepts remain two of the main issues that prevent ILP systems from tackling many learning problems. These learning problems are interesting both from a research perspective, as they raise the standards for ILP systems, and from an application perspective, as such target concepts occur naturally in many real-world applications. Such complex concepts cannot be constructed or evaluated by parallelizing existing top-down ILP systems or improving the underlying Prolog engine; novel search strategies and cover algorithms are needed.

The main focus of this thesis is on how to efficiently construct and evaluate complex hypotheses in an ILP setting. To construct such hypotheses we investigate two approaches. The first, the Top Directed Hypothesis Derivation framework, implemented in the ILP system TopLog, uses a top theory to constrain the hypothesis space. In the second approach we revisit the bottom-up search strategy of Golem, lifting its restriction to determinate clauses, which had rendered Golem inapplicable to many key areas. These developments led to the bottom-up ILP system ProGolem.

A challenge that arises with a bottom-up approach is computing the coverage of long, non-determinate clauses, for which Prolog’s SLD-resolution is no longer adequate. We developed a new, Prolog-based, theta-subsumption engine which is significantly more efficient than SLD-resolution in computing the coverage of such complex clauses.

We provide evidence that ProGolem achieves the goal of learning complex concepts by presenting a protein-hexose binding prediction application. The theory ProGolem induced has statistically significantly better predictive accuracy than that of other learners. More importantly, the biological insights ProGolem’s theory provided were judged by domain experts to be relevant and, in some cases, novel.
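To make the notion of a theta-subsumption coverage test concrete, the following is a minimal Python sketch of a naive backtracking check. It is not the engine developed in the thesis, which is Prolog-based and whose algorithm is not described in this abstract. The representation (literals as `(predicate, arguments)` tuples, variables as capitalised strings), the names `theta_subsumes` and `match_literal`, and the example predicates `bond` and `atom` are all assumptions made for illustration. A clause C theta-subsumes a clause D if some substitution θ maps every literal of C onto a literal of D.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: a literal is (predicate, arguments).
Literal = Tuple[str, Tuple[str, ...]]
Substitution = Dict[str, str]


def is_variable(term: str) -> bool:
    """Assumed convention (as in Prolog): variables start with an uppercase letter."""
    return term[:1].isupper()


def match_literal(lit: Literal, target: Literal,
                  theta: Substitution) -> Optional[Substitution]:
    """Try to extend theta so that lit, under theta, equals target."""
    pred, args = lit
    target_pred, target_args = target
    if pred != target_pred or len(args) != len(target_args):
        return None
    extended = dict(theta)
    for arg, target_arg in zip(args, target_args):
        if is_variable(arg):
            # Bind the variable, or check consistency with an earlier binding.
            if extended.setdefault(arg, target_arg) != target_arg:
                return None
        elif arg != target_arg:
            return None
    return extended


def theta_subsumes(c: List[Literal], d: List[Literal],
                   theta: Optional[Substitution] = None) -> Optional[Substitution]:
    """Return a substitution mapping every literal of c into d, or None.

    Terms of d are treated as fixed constants. The search backtracks over
    all candidate literals of d, so it is exponential in the worst case.
    """
    theta = {} if theta is None else theta
    if not c:
        return theta
    first, rest = c[0], c[1:]
    for target in d:
        extended = match_literal(first, target, theta)
        if extended is not None:
            result = theta_subsumes(rest, d, extended)
            if result is not None:
                return result
    return None


# Hypothesis body: bond(X, Y), atom(Y, oxygen)
clause = [("bond", ("X", "Y")), ("atom", ("Y", "oxygen"))]
# Ground facts describing one example
example = [("bond", ("a1", "a2")), ("bond", ("a2", "a3")),
           ("atom", ("a3", "oxygen"))]

covering_substitution = theta_subsumes(clause, example)
```

In this example the first candidate binding (`X = a1`, `Y = a2`) fails on the `atom` literal and the search backtracks to `X = a2`, `Y = a3`. With long, non-determinate clauses the number of such candidate bindings grows quickly, which is the cost the abstract refers to.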
