Co-evolution of RDF Datasets

Linking Data initiatives have fostered the publication of a large number of RDF datasets in the Linked Open Data (LOD) cloud, as well as the development of query processing infrastructures to access these data in a federated fashion. However, several experimental studies have shown that the availability of LOD datasets cannot always be ensured, so RDF data replication is required to build reliable federated query frameworks. Although it enhances data availability, RDF data replication requires synchronization and conflict resolution when replicas and source datasets are allowed to change over time; that is, co-evolution management must be provided to ensure consistency. In this paper, we tackle the problem of RDF data co-evolution and devise an approach for conflict resolution during the co-evolution of RDF datasets. Our approach is property-oriented and exploits the semantics of RDF properties during co-evolution management. We empirically evaluate the quality of our approach in different scenarios on the DBpedia-live dataset. Experimental results suggest that the proposed techniques have a positive impact on the quality of data in source datasets and replicas.
