Aalborg Universitet Trajectory Similarity Join in Spatial Networks

The matching of similar pairs of objects, called similarity join, is a fundamental functionality in data management. We consider the case of trajectory similarity join (TS-Join), where the objects are trajectories of vehicles moving in road networks. Thus, given two sets of trajectories and a threshold θ, the TS-Join returns all pairs of trajectories from the two sets with similarity above θ. This join targets applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide a purposeful definition of similarity. To enable efficient TS-Join processing on large sets of trajectories, we develop search space pruning techniques and take into account the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer algorithm. For each trajectory, the algorithm first finds similar trajectories. Then it merges the results to obtain the final result. The algorithm exploits an upper bound on the spatiotemporal similarity and a heuristic scheduling strategy for search space pruning. The algorithm’s per-trajectory searches are independent of each other and can be performed in parallel, and the merging has constant cost. An empirical study with real data offers insight into the performance of the algorithm and demonstrates that it is capable of outperforming a well-designed baseline algorithm by an order of magnitude.
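The two-phase structure described above (independent per-trajectory searches followed by a merge) can be illustrated with a minimal sketch. It is not the algorithm of the paper: the abstract does not give the similarity definition, so the sketch assumes a trajectory is a set of road-segment identifiers and substitutes Jaccard similarity as a stand-in. It also omits the upper-bound pruning and the heuristic scheduling, and its merge is a plain concatenation. The names `jaccard`, `search_one`, and `ts_join` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Assumption: a trajectory is modeled as a set of road-segment ids.
Trajectory = FrozenSet[str]
Similarity = Callable[[Trajectory, Trajectory], float]


def jaccard(a: Trajectory, b: Trajectory) -> float:
    """Stand-in similarity (NOT the paper's spatiotemporal measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def search_one(i: int, p: Trajectory, Q: Sequence[Trajectory],
               theta: float, sim: Similarity) -> List[Tuple[int, int]]:
    """Phase 1: for one trajectory p, find all q in Q with sim(p, q) > theta.

    This is an exhaustive scan; no pruning is applied in this sketch.
    """
    return [(i, j) for j, q in enumerate(Q) if sim(p, q) > theta]


def ts_join(P: Sequence[Trajectory], Q: Sequence[Trajectory], theta: float,
            sim: Similarity = jaccard, workers: int = 4) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) such that sim(P[i], Q[j]) > theta."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The per-trajectory searches share no state, so they can run in parallel.
        partials = pool.map(
            lambda item: search_one(item[0], item[1], Q, theta, sim),
            enumerate(P),
        )
        # Phase 2: merge the per-trajectory results into the final result.
        result: List[Tuple[int, int]] = []
        for part in partials:
            result.extend(part)
    return result
```

Calling `ts_join(P, Q, 0.5)` on two lists of `frozenset` trajectories returns the qualifying index pairs in the order of `P`. Threads are used here only to show that the searches are independent; for CPU-bound similarity functions in Python, a process pool would be the more realistic choice.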
