A Data-Driven Approach for GPS Trajectory Data Cleaning

In this paper, we study the problem of GPS trajectory data cleaning, which aims to remove noise from trajectory data. Noise can arise from many sources, such as GPS device failure, sensor error, transmission error, and storage error. Existing cleaning algorithms usually target specific types of noise, which limits their applicability. We propose a data-driven approach that cleans noise by exploiting a historical trajectory point cloud: we extract road information from the historical trajectories and use it to detect and correct noisy points. Compared with map matching techniques, our method does not impose requirements such as a high sampling rate, and it is robust to noise and nonuniform density in the historical trajectory point cloud. Extensive experiments on real datasets demonstrate that the proposed approach cleans noise effectively without using any map information.
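To make the general idea concrete, the following is a minimal sketch of cleaning against a historical point cloud without a map. It is not the method proposed in the paper, whose details the abstract does not give. It assumes a simple stand-in technique: historical points are counted on a uniform grid, cells with enough points are treated as road evidence, and a trajectory point outside every such cell is flagged as noise and snapped to the centre of the nearest dense cell. The function names and the parameters `cell_size` and `min_count` are hypothetical, and coordinates are assumed to be planar (for example, projected metres).

```python
from collections import Counter
from math import floor, hypot


def build_dense_cells(history, cell_size, min_count):
    """Return the grid cells that hold at least min_count historical points.

    Assumption: a dense cell is a crude proxy for "a road passes here".
    """
    counts = Counter(
        (floor(x / cell_size), floor(y / cell_size)) for x, y in history
    )
    return {cell for cell, n in counts.items() if n >= min_count}


def clean_trajectory(traj, dense_cells, cell_size):
    """Return (x, y, corrected) tuples for each point of one trajectory.

    A point inside a dense cell is kept unchanged. Any other point is
    treated as noise and moved to the centre of the nearest dense cell.
    """
    cleaned = []
    for x, y in traj:
        cell = (floor(x / cell_size), floor(y / cell_size))
        if cell in dense_cells or not dense_cells:
            # Supported by history, or no history to compare against.
            cleaned.append((x, y, False))
            continue
        # Linear scan over dense cells; fine for a sketch, slow at scale.
        cx, cy = min(
            dense_cells,
            key=lambda c: hypot(
                (c[0] + 0.5) * cell_size - x, (c[1] + 0.5) * cell_size - y
            ),
        )
        cleaned.append(((cx + 0.5) * cell_size, (cy + 0.5) * cell_size, True))
    return cleaned
```

With `dense = build_dense_cells(history, 1.0, 3)`, calling `clean_trajectory(traj, dense, 1.0)` marks each point with a flag saying whether it was moved. Thresholding on `min_count` gives the sketch some tolerance to sparse outliers in the historical data, but a fixed threshold does not handle strongly nonuniform density, which the paper names as something its own method is robust to.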
