Learning OWL Class Expressions

With the advent of the Semantic Web and Semantic Technologies, ontologies have become one of the most prominent paradigms for knowledge representation and reasoning. OWL, currently the most popular ontology language, is based on description logics (DLs); it became a W3C recommendation in 2004 and a standard for modelling ontologies on the Web. Since then, many studies and applications using OWL have been reported, many of which go beyond Internet usage and employ ontological modelling in other fields such as biology, medicine, software engineering, knowledge management, and cognitive systems. However, progress in the field is hampered by a lack of well-structured ontologies with large amounts of instance data, because engineering such ontologies requires a considerable investment of resources. Today's knowledge bases often provide large volumes of data without sophisticated schemata, so methods for automated schema acquisition and maintenance are sought. Furthermore, many classification and knowledge acquisition problems, e.g. the detection of chemical compounds causing cancer, can be handled with the same techniques. To leverage machine-learning approaches for these tasks, methods and tools are needed for learning concepts in description logics or, equivalently, class expressions in OWL. This thesis shows that methods from Inductive Logic Programming (ILP) are applicable to learning in description logic knowledge bases. The results provide foundations for the acquisition of OWL ontologies, in particular in cases where extensional information (facts, instance data) is easily available, while the corresponding intensional information (schema) is missing or not expressive enough to allow powerful reasoning over the ontology. Such situations often occur when extracting knowledge from different sources, e.g. databases and wikis, or in collaborative knowledge engineering scenarios. Being able to learn OWL class expressions is thus a step towards enriching OWL knowledge bases in order to enable powerful reasoning, consistency checking, and improved querying. In particular, plugins for OWL ontology editors based on learning methods are developed and evaluated in this work. The developed algorithms are not restricted to ontology engineering and can handle other learning problems; they lend themselves to generic use in machine learning in the same way as ILP systems do. The main difference is the knowledge representation paradigm employed: ILP traditionally uses logic programs for knowledge representation, whereas this work rests on DLs/OWL. This distinction is crucial when considering Semantic Web applications as target use cases, as such applications hinge centrally on the chosen knowledge representation format for knowledge interchange and integration.
