Embedding-Based Similarity Computation for Massive Vehicle Trajectory Data

Trajectory similarity computation is a fundamental problem for many intelligent applications, such as mining trip trajectories to find the most popular routes and detecting road anomalies. Existing similarity measures for vehicle trajectories, such as Dynamic Time Warping (DTW) and the Fréchet distance, are computed by dynamic programming with quadratic time complexity in all cases and must handle local time shifts when computing the distance between trajectories, which limits their scalability. In addition, GPS-based vehicle trajectories usually contain errors such as noise and outliers, caused by poor satellite visibility in urban regions and non-uniform sampling rates. To this end, we propose an embedding-based method for trajectory similarity computation with linear time complexity, which encodes a trajectory as a vector via deep representation learning and learns the similarity between trajectories with an attention-based learning-to-rank model. We use an interpolation-based approach to reduce noise and outliers, exploiting the fact that vehicle trajectories are physically constrained to the road network. Experiments on a massive vehicle trajectory dataset show that the proposed approach consistently and significantly outperforms state-of-the-art baselines.
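
To make the complexity contrast concrete, the following is a minimal sketch, not the method proposed here. It shows the textbook DTW recurrence, which fills an n-by-m table, next to a distance between two fixed-size vectors, which costs time linear in the vector dimension. The function names `dtw_distance` and `embedding_distance` are hypothetical, the learned encoder is not shown, and plain Euclidean distance stands in for the learned similarity as an assumption.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two point sequences; O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the DTW cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def embedding_distance(u, v):
    """Distance between two fixed-size vectors; O(len(u)) time.

    Assumption: u and v stand for trajectory embeddings produced by some
    encoder that is not shown here.
    """
    return math.dist(u, v)
```

In this sketch, a trajectory that repeats a sample (a local time shift) still has DTW distance zero to the original, because the recurrence allows one point to align with several. The price is the full table, which is what makes pairwise comparison over a massive dataset expensive.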