Engineering ontology-based access to real-world data sources

The preparation of existing real-world datasets for publication as high-quality semantic web data is a complex task that requires the concerted execution of a variety of processing steps using a range of different tools. With both changing input data and evolving requirements on the produced output, schema and data transformation becomes a significant engineering task. We argue that achieving a robust and flexible transformation process requires a high-level declarative description that can be used to drive the entire tool chain. We have implemented this idea for the deployment of ontology-based data access (OBDA) solutions, where semantically annotated views that integrate multiple data sources in different formats are created, based on an ontology and a collection of mappings. We exemplify our approach and show how a single declarative description helps to orchestrate a complete tool chain, from the download of datasets through to their installation for a variety of tool applications, including data and query transformation processes and reasoning services. Our case study is based on several publicly available tabular and relational datasets concerning the operations of the petroleum industry in Norway. We include a discussion of the relative performance of the tools used in our case study, and an overview of lessons learnt for practical deployment of OBDA on real-world datasets.
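To make the idea of a single declarative description driving the tool chain more concrete, the following is a minimal sketch in Python. It is not the implementation described in the paper. The class names, field names, step names, file names, and example.org URLs are all hypothetical and chosen for illustration only. The point is that the description holds only data (sources, ontology, mappings, deployment targets), and the ordered list of processing steps is derived from it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Source:
    """One input dataset (hypothetical structure, for illustration)."""
    name: str
    url: str
    fmt: str  # assumed to be "csv" (tabular) or "sql" (relational)


@dataclass(frozen=True)
class Description:
    """A declarative description: data only, no processing logic."""
    sources: Tuple[Source, ...]
    ontology: str             # path to the ontology file
    mappings: str             # path to the mapping collection
    targets: Tuple[str, ...]  # deployment targets, names are assumptions


def plan(desc: Description) -> List[Tuple[str, str]]:
    """Derive an ordered list of (step, argument) pairs from the description.

    Every source is downloaded and loaded before any target is deployed,
    so changing the description changes the plan without touching this code.
    """
    steps: List[Tuple[str, str]] = []
    for src in desc.sources:
        steps.append(("download", src.name))
        steps.append(("load_" + src.fmt, src.name))
    for target in desc.targets:
        steps.append(("deploy_" + target, desc.ontology))
    return steps


# Hypothetical example description with one tabular and one relational source.
desc = Description(
    sources=(
        Source("fields", "https://example.org/fields.csv", "csv"),
        Source("wells", "https://example.org/wells.sql", "sql"),
    ),
    ontology="npd.owl",
    mappings="npd-mappings.ttl",
    targets=("materialised", "virtual"),
)

for step, arg in plan(desc):
    print(step, arg)
```

In this sketch, adding a dataset or a deployment target means editing only the description. A real tool chain would map each derived step to an actual tool invocation, which is left out here.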
