Effective rule induction from labeled graphs

Labeled graphs provide a natural way of representing objects and the way they are connected. They have applications in various fields, for example in computational chemistry. They can be represented by relational structures and thus stored in relational databases. Acyclic conjunctive queries form a practically relevant fragment of database queries that can be evaluated in polynomial time. We propose a top-down induction algorithm for learning acyclic conjunctive queries from labeled graphs represented by relational structures. The algorithm allows the use of building blocks that depend on the particular application considered. To compensate for the reduced expressive power of the hypothesis language, and thus the potential loss in predictive performance, we combine acyclic conjunctive queries with confidence-rated boosting. In an empirical evaluation, the method achieves excellent predictive accuracy in the mutagenicity domain.
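As a rough illustration of the representation described above, the sketch below stores a small labeled graph as two relations and evaluates one path-shaped (hence acyclic) conjunctive query against it. The relation names `atom` and `bond`, the toy molecule, and the function `matches_path_query` are hypothetical choices made for this example; the code shows query evaluation only, not the induction algorithm proposed in the paper.

```python
from collections import defaultdict

# Hypothetical toy molecule stored as two relations (assumed schema):
#   atom(vertex, element) and bond(vertex, vertex, bond_type)
atom = {(1, "c"), (2, "c"), (3, "n"), (4, "o")}
bond = {(1, 2, "single"), (2, 3, "single"), (3, 4, "double")}


def matches_path_query(atom, bond, labels):
    """Evaluate the Boolean query
        atom(X1, l1), bond(X1, X2, _), atom(X2, l2), ..., atom(Xk, lk)
    for a non-empty list of element labels l1..lk.

    The query is acyclic because its variables form a path. As usual for
    conjunctive queries, distinct variables may map to the same vertex.
    """
    by_label = defaultdict(set)
    for vertex, label in atom:
        by_label[label].add(vertex)

    # Bonds are undirected, so record both directions.
    neighbours = defaultdict(set)
    for u, v, _ in bond:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Semijoin pass from the last variable back to the first: keep only
    # vertices that can still be extended to a match of the remaining atoms.
    candidates = set(by_label[labels[-1]])
    for label in reversed(labels[:-1]):
        candidates = {v for v in by_label[label] if neighbours[v] & candidates}
    return bool(candidates)


has_c_n_o = matches_path_query(atom, bond, ["c", "n", "o"])
has_c_o = matches_path_query(atom, bond, ["c", "o"])
```

Each step of the backward pass touches every vertex and its neighbours at most once, so the work grows polynomially with the size of the graph and the length of the query. This is the property that makes acyclic queries attractive as a hypothesis language.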
