Searching activity trajectory with keywords

Driven by advances in positioning techniques and the popularity of location-sharing services, semantically enriched trajectory data have become available on an unprecedented scale. While finding relevant Points of Interest (PoIs) based on users' locations and query keywords has been studied extensively in recent years, keyword queries over activity trajectory databases remain largely unexplored. In this paper, we study the problem of searching activity trajectories by keywords. Given a set of query keywords, a keyword-oriented query for activity trajectory (KOAT) returns the k trajectories that contain the keywords most relevant to the query while incurring the least travel effort. The main difference between KOAT and conventional spatial keyword queries is that KOAT has no query location, so the search area cannot be localized. To capture travel effort in the context of the query keywords, we first define a novel score function, called the spatio-textual ranking function. We then develop a hybrid index structure, called GiKi, that organizes the trajectories hierarchically and enables the search space to be pruned by spatial and textual similarity simultaneously. Finally, we propose an efficient search algorithm and a fast method for evaluating the spatio-textual ranking function. In addition, we extend the KOAT techniques to support range-based queries and order-sensitive queries, which broadens their practical applicability. Empirical studies on real check-in datasets demonstrate that the proposed index and algorithms scale well.
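
To make the query semantics concrete, the following is a minimal brute-force sketch of a KOAT-style top-k search in Python. It is not the method of the paper: the abstract does not give the spatio-textual ranking function, so the `score` used here (keyword coverage divided by one plus the travelled distance between the first and last matching points) is an assumption made only for illustration, and a linear scan stands in for the GiKi index. All names (`Point`, `travel_effort`, `score`, `koat_bruteforce`) are hypothetical.

```python
import heapq
import math
from typing import Dict, List, Set, Tuple

# One check-in: planar coordinates plus the activity keywords attached to it.
Point = Tuple[float, float, Set[str]]


def travel_effort(points: List[Point], query: Set[str]) -> float:
    """Path length from the first to the last point matching any query keyword."""
    hits = [i for i, (_, _, kws) in enumerate(points) if kws & query]
    if len(hits) < 2:
        return 0.0
    first, last = hits[0], hits[-1]
    return sum(
        math.dist(points[i][:2], points[i + 1][:2]) for i in range(first, last)
    )


def score(points: List[Point], query: Set[str]) -> float:
    """Assumed ranking: keyword coverage discounted by travel effort."""
    covered = set().union(*(kws for _, _, kws in points)) & query
    if not covered or not query:
        return 0.0
    coverage = len(covered) / len(query)
    return coverage / (1.0 + travel_effort(points, query))


def koat_bruteforce(
    trajectories: Dict[str, List[Point]], query: Set[str], k: int
) -> List[str]:
    """Return the ids of the k highest-scoring trajectories (linear scan)."""
    return heapq.nlargest(
        k, trajectories, key=lambda tid: score(trajectories[tid], query)
    )


if __name__ == "__main__":
    data = {
        "t1": [(0.0, 0.0, {"coffee"}), (0.5, 0.0, {"museum"})],
        "t2": [(0.0, 0.0, {"coffee"}), (10.0, 0.0, {"museum"})],
        "t3": [(0.0, 0.0, {"coffee"}), (1.0, 0.0, {"park"})],
    }
    print(koat_bruteforce(data, {"coffee", "museum"}, k=2))
```

Note that no query location appears anywhere in the sketch: every trajectory in the database is a candidate, which is why an index that prunes on spatial and textual similarity together matters at scale.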
