SP^2Bench: A SPARQL Performance Benchmark

The SPARQL query language for RDF has recently reached W3C Recommendation status. In response to this emerging standard, the database community is exploring efficient storage techniques for RDF data and evaluation strategies for SPARQL queries. A meaningful analysis and comparison of these approaches requires a comprehensive and universal benchmark platform. To this end, we have developed SP^2Bench, a publicly available, language-specific SPARQL performance benchmark. SP^2Bench is set in the DBLP scenario and comprises both a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries. The generated documents mirror key characteristics and social-world distributions encountered in the original DBLP data set, while the queries implement meaningful requests on top of this data, covering a variety of SPARQL operator constellations and RDF access patterns. As a proof of concept, we apply SP^2Bench to existing engines and discuss the strengths and weaknesses that follow from the benchmark results.
