Europeana RDF Store Report

Expressing data in RDF is one of the principles to consider when making data available as Linked Data on the Web. This can be achieved either by using RDF wrappers for existing (relational) data stores or by using RDF stores as data repositories. The latter requires dedicated RDF storage solutions, many of which are available today. Organizations often have difficulty deciding which solution to adopt, because comprehensive comparisons of existing RDF stores are scarce and reported experience with their performance and scalability is still lacking. In this report, we summarize the results of a qualitative and quantitative study we carried out on existing RDF stores in the context of the European Digital Library project. We give a detailed overview of existing RDF store solutions, analyze their functional and non-functional features, summarize the outcomes of previous studies, and conduct a Linked Data-oriented performance evaluation of a subset of existing triple stores with respect to load and query time. The results of this study show that certain RDF stores, such as OpenLink Virtuoso or 4Store, can handle the Europeana data volume and answer the SPARQL queries relevant for exposing Europeana metadata as Linked Data within an acceptable time range.
