Inverted linear quadtree: Efficient top k spatial keyword search

With advances in geo-positioning technologies and geo-location services, a rapidly growing number of spatio-textual objects are being collected in applications such as location-based services and social networks, where each object is described by its spatial location and a set of keywords (terms). Consequently, spatial keyword search, which exploits both the location and the textual description of objects, has attracted great attention from commercial organizations and research communities. In this paper, we study the problem of top k spatial keyword search (TOPK-SK), which is fundamental among spatial keyword queries. Given a set of spatio-textual objects, a query location, and a set of query keywords, top k spatial keyword search retrieves the k closest objects, each of which contains all keywords in the query. Based on the inverted index and the linear quadtree, we propose a novel index structure, called the inverted linear quadtree (IL-Quadtree), which is carefully designed to exploit both spatial and keyword-based pruning techniques to effectively reduce the search space. An efficient algorithm is then developed to tackle top k spatial keyword search. In addition, we show that the IL-Quadtree technique can also be applied to improve the performance of other spatial keyword queries, such as the direction-aware top k spatial keyword search and the spatio-textual ranking query. Comprehensive experiments on real and synthetic data clearly demonstrate the efficiency of our methods.
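
To make the TOPK-SK definition concrete, the following Python sketch builds one inverted list per keyword, keyed by the Morton (Z-order) code of a fixed-size grid cell, and answers a query by intersecting the cells shared by all query keywords. It is only an illustration under stated assumptions, not the IL-Quadtree or the algorithm proposed in the paper: the names `morton_code`, `build_index`, and `topk_sk`, the uniform grid with `cell_size`, and the non-negative coordinates are all hypothetical, and the sketch applies keyword-based pruning at cell level only, with no spatial pruning.

```python
import heapq
from math import hypot


def morton_code(x, y, bits=16):
    """Interleave the bits of non-negative integer cell coordinates (x, y)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return code


def build_index(objects, cell_size=1.0):
    """Return {keyword: {morton_code: [object ids]}} for the given objects.

    `objects` maps an object id to (x, y, set_of_keywords).
    Assumption: coordinates are non-negative and cells form a uniform grid.
    """
    index = {}
    for oid, (x, y, terms) in objects.items():
        key = morton_code(int(x // cell_size), int(y // cell_size))
        for term in terms:
            index.setdefault(term, {}).setdefault(key, []).append(oid)
    return index


def topk_sk(objects, index, query_point, keywords, k):
    """Return ids of the k closest objects that contain all query keywords."""
    keywords = list(keywords)
    if not keywords or k <= 0:
        return []
    # Keyword-based pruning: an object containing every keyword must lie in
    # a cell that appears in the inverted list of every keyword.
    lists = [index.get(term, {}) for term in keywords]
    shared_cells = set(lists[0])
    for cells in lists[1:]:
        shared_cells &= set(cells)
    # Within the shared cells, verify each candidate against all keywords.
    required = set(keywords)
    candidates = [
        oid
        for cell in shared_cells
        for oid in lists[0][cell]
        if required <= objects[oid][2]
    ]
    qx, qy = query_point

    def distance(oid):
        x, y, _ = objects[oid]
        return hypot(x - qx, y - qy)

    return heapq.nsmallest(k, candidates, key=distance)


# Tiny example with made-up data.
objects = {
    1: (1.0, 1.0, {"coffee", "wifi"}),
    2: (2.0, 2.0, {"coffee"}),
    3: (5.0, 5.0, {"coffee", "wifi"}),
    4: (0.5, 0.5, {"wifi"}),
}
index = build_index(objects)
result = topk_sk(objects, index, (0.0, 0.0), {"coffee", "wifi"}, k=2)
```

In this example, objects 2 and 4 are rejected because each lacks one query keyword, so `result` holds objects 1 and 3 in order of distance from the query point. A real index would additionally use the quadtree hierarchy to skip cells that are too far from the query location, which this sketch does not attempt.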
