AND Parallelism for ILP: The APIS System

Inductive Logic Programming (ILP) is a well-known approach to Multi-Relational Data Mining. ILP systems may take a long time to analyze the data, mainly because the search (hypothesis) spaces are often very large and because evaluating each hypothesis, which involves theorem proving, may be quite time-consuming in some domains. To address these efficiency issues we propose the APIS (And ParallelISm for ILP) system, which uses results from AND-parallelism in Logic Programming. The approach partitions the search space into sub-spaces of two kinds: sub-spaces where clause evaluation requires theorem proving, and sub-spaces where clause evaluation is performed efficiently without resorting to a theorem prover. We have also defined a new type of redundancy (coverage-equivalent redundancy) that enables the pruning of significant parts of the search space. The new type of pruning, together with the partition of the hypothesis space, considerably improves the performance of the APIS system. An empirical evaluation of the APIS system on standard ILP data sets shows considerable speedups without a loss of accuracy in the models constructed.
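The two ideas in the abstract can be illustrated with a minimal sketch. It is not the APIS implementation; it rests on two assumptions that the abstract does not spell out: that sub-goals which are independent (sharing only variables already bound by the example) can each be evaluated once, so that the coverage of their conjunction is the intersection of their coverage sets, and that two candidates are "coverage-equivalent" when they cover exactly the same examples. The predicate names, example ids, and the functions `prune_coverage_equivalent` and `combine` are hypothetical.

```python
from itertools import combinations

# Hypothetical coverage sets: ids of the examples covered by each independent
# sub-goal. In a real ILP system these would be computed by a theorem prover.
SUBGOAL_COVERAGE = {
    "has_ring(M)": frozenset({1, 2, 3, 5}),
    "has_nitro(M)": frozenset({2, 3, 5, 8}),
    "aromatic(M)": frozenset({1, 2, 3, 5}),  # same coverage as has_ring(M)
}


def prune_coverage_equivalent(coverage):
    """Keep one representative sub-goal per distinct coverage set.

    Assumption: candidates covering exactly the same examples are treated
    as redundant, so only the first one (in sorted order) is kept.
    """
    representative = {}
    for goal in sorted(coverage):
        representative.setdefault(coverage[goal], goal)
    return {goal: cov for cov, goal in representative.items()}


def combine(coverage, size=2):
    """Evaluate conjunctions of independent sub-goals without theorem proving.

    Assumption: for independent sub-goals, the coverage of the conjunction
    is the intersection of the individual coverage sets.
    """
    result = {}
    for goals in combinations(sorted(coverage), size):
        result[goals] = frozenset.intersection(*(coverage[g] for g in goals))
    return result


representatives = prune_coverage_equivalent(SUBGOAL_COVERAGE)
conjunctions = combine(representatives)

for goals, covered in conjunctions.items():
    print(" , ".join(goals), "covers", sorted(covered))
```

In this sketch only the individual sub-goals would need a prover; every conjunction built from them is evaluated with a set intersection, and pruning one of the two equivalent sub-goals removes every conjunction that would have contained it.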
