Clue-based Spatio-textual Query

With the proliferation of online digital maps and location-based services, very large POI (point of interest) databases have been constructed, in which each record corresponds to a POI with information including name, category, address, geographical location and other features. A basic spatial query in a POI database is POI retrieval. In many scenarios, a user cannot provide enough information to pinpoint the POI and can offer only some clue. For example, a user wants to identify a cafe in a city visited many years ago. She cannot remember the name or address, but she still recalls that "the cafe is about 200 meters away from a restaurant; and turning left at the restaurant there is a bakery 500 meters away, etc.". Intuitively, the clue, even if partial and approximate, describes the spatio-textual context around the targeted POI. Motivated by this observation, this work investigates the clue-based spatio-textual query, which allows the user to provide a clue, i.e., some nearby POIs and the spatial relationships between them, in POI retrieval. The objective is to retrieve the k POIs from a POI database with the highest spatio-textual context similarity to the clue. This work designs a data-quality-tolerant spatio-textual context similarity metric to cope with various data quality problems in both the clue and the POI database. Through crossing valuation, the query accuracy is further enhanced by an ensemble method. This work also develops an index called roll-out-star R-tree (RSR-tree) to dramatically improve query processing efficiency. Extensive tests on real-world data sets have verified the superiority of our methods in all aspects.
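To make the query objective concrete, the following is a minimal brute-force sketch of top-k retrieval by context similarity. It is not the metric, the ensemble method, or the RSR-tree index proposed in this work. The `POI` record, the planar coordinates in meters, the exponential distance-error score, and the function names `context_score` and `clue_query` are all assumptions made for illustration. The sketch also simplifies the clue to a list of (category, expected distance from the target) pairs and ignores directional relationships such as "turning left".

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class POI:
    name: str
    category: str
    x: float  # planar coordinates in meters (illustrative assumption)
    y: float


def distance(a: POI, b: POI) -> float:
    """Euclidean distance between two POIs."""
    return math.hypot(a.x - b.x, a.y - b.y)


def context_score(candidate: POI, clue, pois) -> float:
    """Toy context similarity in [0, 1] (hypothetical, not the paper's metric).

    Each clue item is (category, expected_distance > 0). Its contribution is
    the best exp(-relative distance error) over all POIs of that category.
    """
    total = 0.0
    for category, expected in clue:
        best = 0.0
        for p in pois:
            if p is candidate or p.category != category:
                continue
            error = abs(distance(candidate, p) - expected)
            best = max(best, math.exp(-error / expected))
        total += best
    return total / len(clue)


def clue_query(pois, target_category, clue, k):
    """Return the k candidates of target_category with the highest scores."""
    candidates = [p for p in pois if p.category == target_category]
    return heapq.nlargest(k, candidates,
                          key=lambda p: context_score(p, clue, pois))


pois = [
    POI("Cafe A", "cafe", 0.0, 0.0),
    POI("Cafe B", "cafe", 1000.0, 1000.0),
    POI("Diner", "restaurant", 200.0, 0.0),
    POI("Bakery", "bakery", 200.0, 500.0),
]
# Clue: a restaurant about 200 m away and a bakery about 540 m away.
clue = [("restaurant", 200.0), ("bakery", 540.0)]
top = clue_query(pois, "cafe", clue, k=2)
print([p.name for p in top])
```

This sketch scans every POI for every candidate, so its cost grows quadratically with the database size; avoiding that kind of exhaustive scan is the reason an index is needed in practice.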
