Efficient and robust data augmentation for trajectory analytics: a similarity-based approach

Trajectories between the same origin and destination (OD) offer valuable information for understanding the diversity of moving behaviours and the intrinsic relationships between moving objects and specific locations. However, because trajectory data is sparse, there are often too few trajectories to run mining algorithms, such as classification and clustering, that discover the intrinsic properties of OD mobility. In this work, we propose an efficient and robust trajectory augmentation approach that constructs a sizeable set of qualified trajectories from existing data to address the sparsity issue. The high-level idea is to concatenate existing trajectories into a sufficient number of reconstructed trajectories that represent those travelling directly across the OD pair. To achieve this goal, we first propose a transition graph that supports efficient sub-trajectory concatenation. In addition, we develop a novel similarity metric that measures the similarity between two sets of trajectories, so as to validate whether the reconstructed trajectory set represents the original traces well. Empirical studies on a large real trajectory dataset show that our proposed solutions are efficient and robust.
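
The concatenation idea can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes trajectories are already mapped to sequences of discrete location labels, and the names `build_transition_graph`, `augment`, and `max_len` are hypothetical. The graph records which location was observed directly after which, and a bounded depth-first search then stitches observed transitions into loop-free paths across the OD pair.

```python
from collections import defaultdict


def build_transition_graph(trajectories):
    """Map each location to the set of locations observed directly after it."""
    graph = defaultdict(set)
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            if a != b:  # ignore repeated samples at the same location
                graph[a].add(b)
    return graph


def augment(graph, origin, destination, max_len=10):
    """Enumerate loop-free origin-to-destination paths of at most max_len locations.

    Each path is a concatenation of transitions that appear in some observed
    trajectory, even if no single trajectory covers the whole OD pair.
    """
    results = []
    stack = [(origin, [origin])]
    while stack:
        node, path = stack.pop()
        if node == destination:
            results.append(path)
            continue
        if len(path) >= max_len:
            continue  # assumed length bound to keep the search finite and cheap
        for nxt in sorted(graph.get(node, ())):
            if nxt not in path:  # keep paths loop-free
                stack.append((nxt, path + [nxt]))
    return results


# Toy data (assumed): no observed trajectory goes from "A" to "D" directly.
trajectories = [
    ["A", "B", "C"],
    ["B", "D"],
    ["C", "D"],
    ["A", "E"],
]
graph = build_transition_graph(trajectories)
paths = augment(graph, "A", "D")
```

In this toy input, `paths` holds the reconstructed routes from "A" to "D" that reuse sub-trajectories of the first three traces. A real system would still need the set-level similarity check described above to decide whether such reconstructed trajectories are representative.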
