Inductive Logic Programming

Integrating heterogeneous data from sources as diverse as web pages, digital libraries, knowledge bases, the Semantic Web and databases is an open problem. The ultimate aim of our work is to query such heterogeneous data sources as if their data were held in a single relational database. To this end, we propose a generalisation of joins from the relational database model that enables joins on arbitrarily complex structured data in a higher-order representation. By incorporating kernels and distances for structured data, we further extend this model to support approximate joins of heterogeneous data. We demonstrate the flexibility of our approach in the publications domain by evaluating example approximate queries on the CORA data sets, joining on types ranging from sets of co-authors to entire publications.
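
To make the idea of an approximate join concrete, the following is a minimal sketch and not the method evaluated in this work. It assumes publications are plain dictionaries, that co-author sets are compared with the Jaccard distance, and that two records join when their distance is at most a chosen threshold. The function names, the record layout, the sample data and the threshold are all hypothetical choices made for illustration.

```python
from typing import Any, Callable, Iterable, List, Tuple

Record = dict  # hypothetical record layout: {"title": str, "authors": frozenset}


def jaccard_distance(x: frozenset, y: frozenset) -> float:
    """Jaccard distance on sets: 0.0 for identical sets, 1.0 for disjoint ones."""
    if not x and not y:
        return 0.0
    return 1.0 - len(x & y) / len(x | y)


def approximate_join(
    left: Iterable[Record],
    right: Iterable[Record],
    key: Callable[[Record], Any],
    distance: Callable[[Any, Any], float],
    threshold: float,
) -> List[Tuple[Record, Record]]:
    """Pair every left record with every right record whose keys are close.

    With threshold 0.0 and a distance that is zero only for equal keys,
    this reduces to an ordinary equi-join on the key.
    """
    right = list(right)  # allow the right side to be iterated repeatedly
    return [
        (l, r)
        for l in left
        for r in right
        if distance(key(l), key(r)) <= threshold
    ]


# Hypothetical sample data standing in for two bibliographic sources.
source_a = [
    {"title": "Paper A1", "authors": frozenset({"Smith", "Jones", "Lee"})},
    {"title": "Paper A2", "authors": frozenset({"Garcia"})},
]
source_b = [
    {"title": "Paper B1", "authors": frozenset({"Smith", "Jones"})},
    {"title": "Paper B2", "authors": frozenset({"Garcia", "Chen", "Patel"})},
]


def authors(record: Record) -> frozenset:
    """Join key: the set of co-authors of a publication."""
    return record["authors"]


# Join on co-author sets, accepting pairs at Jaccard distance 0.5 or less.
matches = approximate_join(source_a, source_b, authors, jaccard_distance, 0.5)
matched_titles = [(l["title"], r["title"]) for l, r in matches]

# The same query with threshold 0.0 behaves like an exact join.
exact_titles = [
    (l["title"], r["title"])
    for l, r in approximate_join(source_a, source_b, authors, jaccard_distance, 0.0)
]
```

Because the join key and the distance are passed in as functions, the same sketch could join on other structured types, such as whole publication records, by supplying a different key and a distance or kernel-induced distance suited to that type.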
