Evaluating Geospatial RDF Stores Using the Benchmark Geographica 2

Since 2007, geospatial extensions of SPARQL, such as GeoSPARQL and stSPARQL, have been defined, and corresponding geospatial RDF stores have been implemented. In addition, some work has been carried out on developing benchmarks for evaluating geospatial RDF stores. In this paper, we revisit the Geographica benchmark, defined by our group in 2013, which uses both real-world and synthetic data to test the performance and functionality of geospatial RDF stores. We present Geographica 2, a new version of the benchmark that extends Geographica by adding one more workload, extending our existing workloads, and evaluating five more RDF stores. Using three different real-world workloads, Geographica 2 tests the efficiency of primitive spatial functions in RDF stores and the performance of the RDF stores in real use case scenarios. A more detailed evaluation is performed using a synthetic workload, and the scalability of the RDF stores is stressed with the scalability workload. In total, eight systems are evaluated, of which six adequately support GeoSPARQL and two offer limited spatial support.
