Decentralized Collaborative Knowledge Management using Git

The World Wide Web and the Semantic Web are designed as a network of distributed services and datasets. This distributed character of the Web opens up many possibilities for collaboratively exchanging data. However, the commonly adopted collaborative solutions for RDF data are centralized (e.g. SPARQL endpoints and wiki systems). Supporting distributed collaboration requires a system that lets datasets diverge, makes it possible to conflate diverged states, and allows distributed datasets to be synchronized. In this paper, we present Quit Store, which is inspired by and builds upon the successful Git system. The approach is based on a formal expression of the evolution and consolidation of distributed datasets. During the collaborative curation process, the system automatically versions the RDF dataset and tracks provenance information. It also provides support to branch, merge, and synchronize distributed RDF datasets. The merging process is guarded by specific merge strategies for RDF data. Finally, we use our reference implementation to show overall good performance and to demonstrate the practical usability of the system.
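To make the idea of a merge strategy for RDF data more concrete, the following is a minimal sketch of a three-way merge over sets of triples. It is an illustration under stated assumptions, not the Quit Store implementation: the function name `three_way_merge`, the representation of a graph as a `frozenset` of string tuples, and the example triples are all hypothetical, and blank nodes are ignored entirely.

```python
from typing import FrozenSet, Tuple

# Assumption: a triple is a plain (subject, predicate, object) tuple of strings,
# and a graph is a frozenset of such triples. Blank nodes are not handled.
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def three_way_merge(base: Graph, ours: Graph, theirs: Graph) -> Graph:
    """Merge two diverged graphs relative to their common ancestor `base`.

    A triple added on either branch is kept; a triple removed on either
    branch is dropped. Hypothetical helper for illustration only.
    """
    added = (ours - base) | (theirs - base)
    removed = (base - ours) | (base - theirs)
    return frozenset((base - removed) | added)


# Example: one branch adds a triple, the other removes one.
base = frozenset({
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),
})
ours = base | {("ex:alice", "foaf:knows", "ex:carol")}
theirs = base - {("ex:alice", "foaf:name", '"Alice"')}

merged = three_way_merge(base, ours, theirs)
```

In this sketch the merged graph keeps the addition from one branch and applies the removal from the other, because every change is expressed as a set difference against the common ancestor. Detecting semantic conflicts between the two branches would need additional, context-aware checks that are not shown here.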
