Collective spatial keyword queries: a distance owner-driven approach

Spatial keyword queries have recently become a hot topic in the literature. One example is the collective spatial keyword query (CoSKQ), which finds a set of objects in the database that collectively covers a given set of keywords and has the smallest cost. Unfortunately, existing exact algorithms have severe scalability problems, and existing approximate algorithms, though scalable, cannot guarantee near-optimal solutions. In this paper, we study the CoSKQ problem and address these issues. First, we consider the CoSKQ problem under an existing cost measurement called the maximum sum cost. This problem is called MaxSum-CoSKQ and is known to be NP-hard. We observe that the maximum sum cost of a set of objects is dominated by at most three objects, which we call the distance owners of the set. Motivated by this observation, we propose a distance owner-driven approach comprising two algorithms: an exact algorithm that runs several orders of magnitude faster than the best-known existing algorithm, and an approximate algorithm that improves the best-known constant approximation factor from 2 to 1.375. Second, we propose a new cost measurement called the diameter cost; CoSKQ with this measurement is called Dia-CoSKQ. We prove that Dia-CoSKQ is NP-hard. Using the same distance owner-driven approach, we design two algorithms for Dia-CoSKQ: an exact algorithm that is efficient and scalable, and an approximate algorithm that gives a √3-factor approximation. We conducted extensive experiments on real datasets, which verified that the proposed exact algorithms are scalable and that the proposed approximate algorithms return near-optimal solutions.
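
To make the notion of distance owners concrete, the following is a minimal Python sketch. It rests on assumptions not stated in the abstract: objects and the query are 2-D points under Euclidean distance, the maximum sum cost is taken to be the largest query-to-object distance plus the largest pairwise distance within the set, and the diameter cost is taken to be the larger of those two terms. The function names (`distance_owners`, `maxsum_cost`, `dia_cost`) are hypothetical, and the sketch only evaluates the cost of a given set; it does not check keyword coverage and is not either of the proposed algorithms.

```python
from itertools import combinations
from math import dist


def distance_owners(query, objects):
    """Return (query_owner, (pair_a, pair_b)) for a non-empty list of points.

    query_owner is the object farthest from the query; (pair_a, pair_b) is
    the pair of objects farthest from each other. Together they are at most
    three distinct objects (assumed reading of "distance owners").
    """
    query_owner = max(objects, key=lambda o: dist(query, o))
    if len(objects) < 2:
        # A singleton set has pairwise distance 0.
        return query_owner, (objects[0], objects[0])
    pair = max(combinations(objects, 2), key=lambda p: dist(p[0], p[1]))
    return query_owner, pair


def maxsum_cost(query, objects):
    """Assumed maximum sum cost: max query distance + max pairwise distance."""
    owner, (a, b) = distance_owners(query, objects)
    return dist(query, owner) + dist(a, b)


def dia_cost(query, objects):
    """Assumed diameter cost: the larger of the two distance terms."""
    owner, (a, b) = distance_owners(query, objects)
    return max(dist(query, owner), dist(a, b))


if __name__ == "__main__":
    q = (0, 0)
    candidate_set = [(3, 4), (0, 1), (3, 0)]
    print(distance_owners(q, candidate_set))
    print(maxsum_cost(q, candidate_set), dia_cost(q, candidate_set))
```

Under these assumptions, both costs depend only on the distance owners, so any other object lying within the same distances could be added to the set without changing its cost. That is the property the distance owner-driven approach is described as exploiting.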
