Towards an Open Extensible Framework for Empirical Benchmarking of Data Management Solutions: LITMUS

Developments in the context of Open, Big, and Linked Data have led to an enormous growth of structured data on the Web. To keep up with the pace of efficient consumption and management of the data at this rate, many Data Management Solutions (DMSs) have been developed. There exist many efforts for benchmarking these domain-specific DMSs; however, (i) reproducing these third-party benchmarks is an extremely tedious task, and (ii) there is a lack of a common framework that enables and advocates the extensibility and reusability of the benchmarks. We propose LITMUS, one such framework for benchmarking data management solutions. LITMUS will go beyond classical storage benchmarking frameworks by allowing the performance of DMSs to be analysed across query languages. In this early-stage doctoral work, we present the LITMUS concept, the considerations that led to its preliminary architecture, and the progress made so far in its realisation.
