A Novel Indexing Method for Spatial-Keyword Range Queries

Spatial-keyword queries are important for a wide range of applications that retrieve data based on a combination of keyword search and spatial constraints. However, processing such queries efficiently is nontrivial, because combining textual and spatial data yields a high-dimensional representation that is difficult to index effectively. To address this problem, we propose a novel indexing scheme that efficiently supports spatial-keyword range queries. At the heart of our approach lies a carefully designed mapping of spatio-textual data to a two-dimensional (2D) space that produces compact partitions of the data. The mapped 2D data can then be indexed effectively by traditional spatial data structures, such as an R-tree. We derive bounds, with proofs of their correctness, that lead to a filter-and-refine algorithm that prunes the search space effectively. As a result, our approach is readily applicable to any database system that provides spatial support. In our experimental evaluation, we demonstrate how our algorithm can be implemented over PostgreSQL, exploiting the underlying spatial index provided by PostGIS to process spatial-keyword range queries efficiently. Moreover, we show that our solution outperforms several competing approaches.
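The filter-and-refine pattern mentioned above can be illustrated with a minimal sketch. It does not reproduce the proposed 2D mapping or its bounds, which the abstract does not specify. Instead it assumes a simple bounding-box test as the filter and a query that requires every query keyword to be present. The names `SpatioTextualObject` and `range_keyword_query` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class SpatioTextualObject:
    """Hypothetical record: a point location plus a set of keywords."""
    ident: int
    x: float
    y: float
    keywords: frozenset


def range_keyword_query(objects, qx, qy, radius, query_keywords):
    """Return sorted ids of objects within `radius` of (qx, qy)
    whose keyword set contains every query keyword (an assumed semantics)."""
    required = frozenset(query_keywords)

    # Filter step: a cheap bounding-box test. In a database this role
    # would be played by a spatial index lookup (e.g. an R-tree); here
    # it is a linear scan, purely for illustration.
    candidates = [
        o for o in objects
        if abs(o.x - qx) <= radius and abs(o.y - qy) <= radius
    ]

    # Refine step: exact distance check and keyword containment,
    # applied only to the candidates that survived the filter.
    return sorted(
        o.ident for o in candidates
        if hypot(o.x - qx, o.y - qy) <= radius and required <= o.keywords
    )


objects = [
    SpatioTextualObject(1, 0.0, 0.0, frozenset({"cafe", "wifi"})),
    SpatioTextualObject(2, 0.9, 0.9, frozenset({"cafe", "wifi"})),  # in box, outside circle
    SpatioTextualObject(3, 0.5, 0.0, frozenset({"cafe"})),          # missing "wifi"
    SpatioTextualObject(4, 5.0, 5.0, frozenset({"cafe", "wifi"})),  # outside box
]
result = range_keyword_query(objects, 0.0, 0.0, 1.0, {"cafe", "wifi"})
```

In this sketch the filter may admit false positives (object 2 passes the box test but fails the exact distance check), while the refine step guarantees that only true matches are returned. A filter built on proven bounds serves the same purpose: it must never discard a true result.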
