Trajectory Similarity Join in Spatial Networks

The matching of similar pairs of objects, called similarity join, is a fundamental functionality in data management. We consider the case of trajectory similarity join (TS-Join), where the objects are trajectories of vehicles moving in road networks. Given two sets of trajectories and a threshold θ, the TS-Join returns all pairs of trajectories, one from each set, whose similarity exceeds θ. This join targets applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide a purposeful definition of similarity. To enable efficient TS-Join processing on large sets of trajectories, we develop search space pruning techniques and take into account the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer algorithm. In the first phase, the algorithm finds, for each trajectory, the trajectories similar to it. In the second phase, it merges these per-trajectory results to obtain the final result. The algorithm exploits an upper bound on the spatiotemporal similarity and a heuristic scheduling strategy for search space pruning. The algorithm's per-trajectory searches are independent of each other and can be performed in parallel, and the merging has constant cost. An empirical study with real data offers insight into the performance of the algorithm and demonstrates that it is capable of outperforming a well-designed baseline algorithm by an order of magnitude.
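
The two-phase structure described above can be illustrated with a minimal Python sketch. It is not the algorithm of the paper: the abstract does not state the similarity definition, the upper bound, or the scheduling heuristic, so the sketch substitutes assumed stand-ins. Trajectories are assumed to be sequences of network vertex ids, `jaccard` is a purely spatial placeholder similarity, `jaccard_upper_bound` is a size-based bound on that placeholder, and the names `search_one` and `ts_join` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def jaccard(p, q):
    """Assumed stand-in similarity: Jaccard overlap of visited vertices."""
    a, b = set(p), set(q)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_upper_bound(p, q):
    """Upper bound on jaccard(p, q) that needs only the two set sizes.

    |A & B| <= min(|A|, |B|) and |A | B| >= max(|A|, |B|), so the ratio
    min/max can never be smaller than the exact Jaccard value.
    """
    a, b = len(set(p)), len(set(q))
    return min(a, b) / max(a, b) if max(a, b) else 0.0


def search_one(i, p, Q, theta, sim, upper_bound):
    """Phase 1: find all trajectories in Q similar to the single trajectory p."""
    matches = []
    for j, q in enumerate(Q):
        if upper_bound(p, q) <= theta:
            continue  # pruned: the exact similarity cannot exceed theta
        s = sim(p, q)
        if s > theta:
            matches.append((i, j, s))
    return matches


def ts_join(P, Q, theta, sim=jaccard, upper_bound=jaccard_upper_bound, workers=4):
    """Run the independent per-trajectory searches, then merge their results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(
            lambda item: search_one(item[0], item[1], Q, theta, sim, upper_bound),
            enumerate(P),
        )
        # Phase 2: merge by concatenating the per-trajectory result lists.
        return [match for part in partials for match in part]


if __name__ == "__main__":
    P = [[1, 2, 3, 4], [7, 8]]
    Q = [[1, 2, 3, 5], [7, 8], [9]]
    for i, j, s in ts_join(P, Q, theta=0.5):
        print(f"P[{i}] ~ Q[{j}] (similarity {s:.2f})")
```

Because each call to `search_one` reads only its own trajectory and the shared set `Q`, the calls can run in any order or concurrently. The thread pool here only demonstrates that independence; CPU-bound workloads in Python would more plausibly use processes, and a faithful implementation would replace both stand-in functions with the spatiotemporal similarity and bound defined in the paper.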
