SPgen: A Benchmark Generator for Spatial Link Discovery Tools

A number of real and synthetic benchmarks have been proposed for evaluating the performance of link discovery systems. So far, only a limited number of them target the problem of linking geo-spatial entities, even though some of the largest knowledge bases of the Linked Open Data Web, such as LinkedGeoData, contain vast amounts of spatial information. Several systems have been developed that manage spatial data and consider the topology of spatial resources and the topological relations between them. To assess the ability of these systems to handle vast amounts of spatial data and to perform the much-needed data integration in the Linked Geo Data Cloud, it is imperative to develop benchmarks for geo-spatial link discovery. In this paper we propose the Spatial Benchmark Generator (SPgen), which can be used to test the performance of link discovery systems that deal with topological relations as defined in the state-of-the-art DE-9IM (Dimensionally Extended nine-Intersection Model). SPgen implements all topological relations of DE-9IM between LineStrings and Polygons in two-dimensional space. We provide a comparative analysis, using benchmarks produced with SPgen, that assesses and identifies the capabilities of the spatial link discovery systems AML, OntoIdea, RADON and Silk.
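
To make the DE-9IM relations mentioned above concrete, the following is a minimal sketch of how a nine-character intersection matrix can be tested against relation patterns. It is not SPgen's implementation. The pattern strings follow the commonly used OGC Simple Features conventions, and the names `PATTERNS`, `cell_matches`, `relate_matches` and `relations` are hypothetical, introduced only for this illustration. The example matrices are hand-derived for a LineString compared with a Polygon.

```python
# Illustrative sketch only: DE-9IM pattern matching in plain Python.
# All names here are hypothetical and not taken from SPgen.

# A DE-9IM matrix is written as 9 characters, row by row:
# (Interior, Boundary, Exterior) of A against (Interior, Boundary, Exterior) of B.
# Each cell is 'F' (empty intersection) or the dimension '0', '1' or '2'.

# Assumed pattern table, following the usual OGC Simple Features definitions.
PATTERNS = {
    "equals": ["T*F**FFF*"],
    "disjoint": ["FF*FF****"],
    "touches": ["FT*******", "F**T*****", "F***T****"],
    "within": ["T*F**F***"],
    "contains": ["T*****FF*"],
    "covers": ["T*****FF*", "*T****FF*", "***T**FF*", "****T*FF*"],
    "covered_by": ["T*F**F***", "*TF**F***", "**FT*F***", "**F*TF***"],
    # "crosses" depends on the dimensions involved; this is the line/area case.
    "crosses_line_area": ["T*T******"],
}


def cell_matches(value: str, symbol: str) -> bool:
    """Check one matrix cell against one pattern symbol."""
    if symbol == "*":
        return True              # wildcard: any value is accepted
    if symbol == "T":
        return value in "012"    # any non-empty intersection
    return value == symbol       # 'F', '0', '1' or '2' must match exactly


def relate_matches(matrix: str, pattern: str) -> bool:
    """Return True if a 9-character DE-9IM matrix satisfies a pattern."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("matrix and pattern must both have 9 characters")
    return all(cell_matches(v, s) for v, s in zip(matrix, pattern))


def relations(matrix: str) -> set:
    """Names of all relations in PATTERNS that hold for the given matrix."""
    return {
        name
        for name, patterns in PATTERNS.items()
        if any(relate_matches(matrix, p) for p in patterns)
    }


# Hand-derived matrices for a LineString A and a Polygon B:
LINE_INSIDE_POLYGON = "1FF0FF212"    # A lies entirely in the interior of B
LINE_THROUGH_POLYGON = "101FF0212"   # A passes through B, both endpoints outside
LINE_AWAY_FROM_POLYGON = "FF1FF0212" # A and B share no point

if __name__ == "__main__":
    for label, m in [
        ("inside", LINE_INSIDE_POLYGON),
        ("through", LINE_THROUGH_POLYGON),
        ("away", LINE_AWAY_FROM_POLYGON),
    ]:
        print(label, sorted(relations(m)))
```

A link discovery system would typically obtain the matrix from a geometry library and then report, for each source and target pair, the relations whose patterns match. The table above covers only the patterns needed for this sketch.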
