Computing Trajectory Similarity in Linear Time: A Generic Seed-Guided Neural Metric Learning Approach

Trajectory similarity computation is a fundamental problem for various applications in trajectory data analysis. However, the high computation cost of existing trajectory similarity measures has become the key bottleneck for trajectory analysis at scale. While there have been many research efforts to reduce this complexity, each is specific to one similarity measure and often yields limited speedups. We propose NeuTraj to accelerate trajectory similarity computation. NeuTraj is generic, accommodating any existing trajectory measure, and fast, computing the similarity of a given trajectory pair in linear time. Furthermore, NeuTraj is elastic: it can work with all spatial-based trajectory indexing methods to reduce the search space. NeuTraj samples a number of seed trajectories from the given database, and then uses their pair-wise similarities as guidance to approximate the similarity function with a neural metric learning framework. NeuTraj features two novel modules to achieve accurate approximation of the similarity function: (1) a spatial attention memory module that augments existing recurrent neural networks for trajectory encoding; and (2) a distance-weighted ranking loss that effectively transcribes information from the seed-based guidance. With these two modules, NeuTraj can yield high accuracy and fast convergence even if the training data is small. Our experiments on two real-life datasets show that NeuTraj achieves over 80% accuracy on the Fréchet, Hausdorff, ERP and DTW measures, outperforming state-of-the-art baselines consistently and significantly. It obtains a 50x-1000x speedup over brute-force methods and a 3x-500x speedup over existing approximate algorithms, while yielding more accurate approximations of the similarity functions.
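The seed-guided idea described above can be illustrated with a minimal sketch. It is not the NeuTraj implementation: the function names, the exp(-alpha * D) normalization, the embedding similarity exp(-||e_a - e_b||), and the rank-based weights are all assumptions made for illustration, and a plain list of vectors stands in for the output of the recurrent encoder. The sketch computes exact pair-wise DTW distances over sampled seeds (quadratic per pair), turns them into similarity guidance, and scores a set of embeddings against that guidance with a rank-weighted squared error, a simplified stand-in for the distance-weighted ranking loss.

```python
import math
import random


def dtw(a, b):
    """Exact DTW between two point lists; O(len(a) * len(b)) time."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the DP table
    for p in a:
        cur = [inf] * (len(b) + 1)
        for j, q in enumerate(b, start=1):
            cost = math.dist(p, q)
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]


def seed_guidance(trajectories, n_seeds, measure, alpha, rng):
    """Sample seeds and build distance matrix D and similarity matrix S.

    S = exp(-alpha * D) is an assumed normalization for this sketch.
    """
    seeds = rng.sample(trajectories, n_seeds)
    D = [[measure(s, t) for t in seeds] for s in seeds]
    S = [[math.exp(-alpha * d) for d in row] for row in D]
    return seeds, D, S


def embedding_similarity(ea, eb):
    """Similarity of two fixed-size embeddings; O(d) time for dimension d."""
    return math.exp(-math.dist(ea, eb))


def rank_weighted_loss(anchor, S, emb):
    """Squared error against the guidance, weighted by similarity rank.

    Seeds most similar to the anchor get the largest weights
    (hypothetical weights 1/rank, normalized to sum to 1).
    """
    others = sorted(
        (j for j in range(len(S)) if j != anchor),
        key=lambda j: -S[anchor][j],
    )
    raw = [1.0 / (r + 1) for r in range(len(others))]
    total = sum(raw)
    loss = 0.0
    for w, j in zip(raw, others):
        diff = S[anchor][j] - embedding_similarity(emb[anchor], emb[j])
        loss += (w / total) * diff * diff
    return loss


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic 2-D trajectories of varying length (illustrative data only).
    trajs = [
        [(rng.random(), rng.random()) for _ in range(rng.randint(5, 10))]
        for _ in range(20)
    ]
    seeds, D, S = seed_guidance(trajs, 5, dtw, alpha=0.5, rng=rng)
    # Placeholder embeddings; a trained encoder would produce these.
    emb = [[rng.random() for _ in range(4)] for _ in seeds]
    print(rank_weighted_loss(0, S, emb))
```

Once an encoder has been trained against such guidance, comparing two trajectories only requires encoding each one in a single pass and calling `embedding_similarity`, which is where the linear-time claim comes from; the quadratic `dtw` call is needed only for the seed pairs.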
