A Relational Wrapper for RDF Reification

Provenance information is essential for trusting and validating the authenticity of data in today’s web-enabled world. The abundance of data made accessible by the growth of the Internet raises the question of how much of it is trustworthy. Provenance information, such as who is responsible for the data or how the data came to be, helps verify its authenticity. Semantic web technologies such as the Resource Description Framework (RDF) can record such provenance information through reification. RDF’s popularity has created a demand for modeling and visualization tools. The work presented in this paper, called R2D, addresses this demand by integrating existing, stable technologies such as relational systems with newer web technologies such as RDF. This paper extends our earlier work on R2D by adding support for RDF reification. Reification allows a level of trust and confidence to be associated with RDF triples, which in turn allows the triples to be ranked or validated for authenticity. We present the algorithmic enhancements made to the components of R2D to support RDF reification, along with performance graphs for queries executed on a database of crime records from a police department.
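To make the notion of reification concrete, the following minimal sketch shows how a single RDF triple can be described by four statements from the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object), after which a provenance or confidence value can be attached to the statement resource. The `ex:` identifiers, the `ex:confidence` property, and the `reify` helper are hypothetical names chosen for illustration; this is not the R2D algorithm itself.

```python
from typing import List, NamedTuple

# Namespace of the standard RDF vocabulary.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def reify(triple: Triple, statement_id: str) -> List[Triple]:
    """Return the four triples that describe `triple` as an rdf:Statement.

    `statement_id` is the resource that stands for the statement itself.
    """
    return [
        Triple(statement_id, RDF + "type", RDF + "Statement"),
        Triple(statement_id, RDF + "subject", triple.subject),
        Triple(statement_id, RDF + "predicate", triple.predicate),
        Triple(statement_id, RDF + "object", triple.obj),
    ]


# Hypothetical crime-record fact (all ex: names are illustrative assumptions).
fact = Triple("ex:case42", "ex:reportedBy", "ex:officerSmith")
reified = reify(fact, "ex:stmt1")

# Provenance is attached to the statement resource, not to the original fact.
provenance = Triple("ex:stmt1", "ex:confidence", "0.9")

for t in reified + [provenance]:
    print(t.subject, t.predicate, t.obj)
```

Because the confidence value hangs off the statement resource, a consumer could rank or filter triples by trust level without altering the original fact.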
