Fast Approximate A-Box Consistency Checking Using Machine Learning

Ontology reasoning is typically a computationally intensive operation. While soundness and completeness of results are required in some use cases, for many others a sensible trade-off between computational effort and correctness of results makes more sense. In this paper, we show that it is possible to approximate a central task in reasoning, i.e., A-box consistency checking, by training a machine learning model which approximates the behavior of a reasoner for a specific ontology. On four different datasets, we show that such learned models consistently achieve an accuracy above 95% at less than 2% of the runtime of a reasoner, using a decision tree with no more than 20 inner nodes. For example, this allows for validating 293M Microdata documents against the schema.org ontology in less than 90 min, compared to 18 days required by a state-of-the-art ontology reasoner.
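The core idea, learning a small decision tree that imitates a reasoner's consistent/inconsistent verdicts, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's actual setup: the four binary features, the toy ontology (Person and Organization are disjoint, birthDate has domain Person, foundingDate has domain Organization), the `toy_reasoner` stand-in, and the simple Gini-based tree learner. Because this toy tree is trained on every possible feature combination, it reproduces the stand-in reasoner exactly; on real data, an approximation error would be expected.

```python
from itertools import product

# Hypothetical binary features: which types/properties an instance asserts.
FEATURES = ["Person", "Organization", "birthDate", "foundingDate"]

def toy_reasoner(x):
    """Stand-in for a real reasoner: returns True if the instance is consistent."""
    f = dict(zip(FEATURES, x))
    is_person = f["Person"] or f["birthDate"]        # domain(birthDate) = Person
    is_org = f["Organization"] or f["foundingDate"]  # domain(foundingDate) = Organization
    return not (is_person and is_org)                # Person and Organization are disjoint

def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def build_tree(rows, labels, features):
    """Return a leaf (bool) or an inner node (feature_index, if_false, if_true)."""
    if len(set(labels)) == 1 or not features:
        return sum(labels) * 2 >= len(labels)  # leaf: majority label

    def split_cost(i):
        left = [y for x, y in zip(rows, labels) if not x[i]]
        right = [y for x, y in zip(rows, labels) if x[i]]
        return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)

    best = min(features, key=split_cost)
    rest = [i for i in features if i != best]
    lo = [(x, y) for x, y in zip(rows, labels) if not x[best]]
    hi = [(x, y) for x, y in zip(rows, labels) if x[best]]
    if not lo or not hi:  # feature does not separate these rows; try the others
        return build_tree(rows, labels, rest)
    return (
        best,
        build_tree([x for x, _ in lo], [y for _, y in lo], rest),
        build_tree([x for x, _ in hi], [y for _, y in hi], rest),
    )

def predict(node, x):
    """Walk the tree until a leaf (a bool) is reached."""
    while isinstance(node, tuple):
        feature, if_false, if_true = node
        node = if_true if x[feature] else if_false
    return node

def count_inner(node):
    """Number of inner (decision) nodes in the tree."""
    if not isinstance(node, tuple):
        return 0
    return 1 + count_inner(node[1]) + count_inner(node[2])

# Label every feature combination with the (toy) reasoner, then fit the tree.
rows = list(product([0, 1], repeat=len(FEATURES)))
labels = [toy_reasoner(x) for x in rows]
tree = build_tree(rows, labels, list(range(len(FEATURES))))

accuracy = sum(predict(tree, x) == y for x, y in zip(rows, labels)) / len(rows)
print("inner nodes:", count_inner(tree), "accuracy:", accuracy)
```

Once trained, checking an instance costs only a few feature lookups along one root-to-leaf path, which is the intuition behind the runtime savings reported in the abstract.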
