Spatial search strategies for open government data: a systematic comparison

The increasing availability of open government datasets on the Web calls for ways to access and search them efficiently. There is, however, little understanding of which spatial search strategies perform best in this context. To address this gap, this work assessed the impact of different spatial search strategies on performance and user relevance judgment. We harvested machine-readable spatial datasets and their metadata from three English-language open government data portals, enhanced the metadata, developed a prototype, and conducted both a theoretical and a user-based evaluation. The results highlight that (i) switching between area of overlap and Hausdorff distance for spatial similarity computation does not have any substantial impact on performance; and (ii) the use of Hausdorff distance induces slightly better user relevance ratings than the use of area of overlap. The data collected and the insights gleaned may serve as a baseline against which future work can compare.
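
To make the two spatial similarity measures named above concrete, the following is a minimal sketch, not the prototype's actual implementation. It assumes that both the query region and each dataset's spatial extent are axis-aligned bounding boxes, and that area of overlap is normalized by the area of the union; the abstract does not specify either choice. All names (`Box`, `overlap_similarity`, `hausdorff_distance`) are hypothetical.

```python
from math import hypot
from typing import Tuple

# Assumption: extents are axis-aligned boxes (min_x, min_y, max_x, max_y).
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate boxes have area 0."""
    return max(box[2] - box[0], 0.0) * max(box[3] - box[1], 0.0)


def overlap_similarity(a: Box, b: Box) -> float:
    """Area of overlap divided by area of union (assumed normalization).

    Higher means more similar; 1.0 for identical boxes, 0.0 if disjoint.
    """
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    intersection = max(width, 0.0) * max(height, 0.0)
    union = area(a) + area(b) - intersection
    return intersection / union if union > 0 else 0.0


def point_to_box_distance(x: float, y: float, box: Box) -> float:
    """Euclidean distance from a point to the nearest point of a filled box."""
    dx = max(box[0] - x, 0.0, x - box[2])
    dy = max(box[1] - y, 0.0, y - box[3])
    return hypot(dx, dy)


def directed_hausdorff(a: Box, b: Box) -> float:
    """Largest distance from any point of a to b.

    Distance to a convex set is a convex function, so over a filled
    rectangle the maximum is reached at one of its four corners.
    """
    corners = [(a[0], a[1]), (a[0], a[3]), (a[2], a[1]), (a[2], a[3])]
    return max(point_to_box_distance(x, y, b) for x, y in corners)


def hausdorff_distance(a: Box, b: Box) -> float:
    """Symmetric Hausdorff distance; lower means more similar."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    query = (0.0, 0.0, 2.0, 2.0)
    dataset_extent = (1.0, 1.0, 3.0, 3.0)
    print(overlap_similarity(query, dataset_extent))
    print(hausdorff_distance(query, dataset_extent))
```

Note that the two measures run in opposite directions: a ranking based on overlap sorts descending, while one based on Hausdorff distance sorts ascending. Hausdorff distance also keeps discriminating between disjoint extents, for which area of overlap is uniformly zero.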