Managing massive trajectories on the cloud

With advances in location-acquisition techniques, such as GPS-embedded phones, an enormous volume of trajectory data is generated by people, vehicles, and animals. This trajectory data is one of the most important data sources in many urban computing applications, e.g., traffic modeling, user profiling analysis, air quality inference, and resource allocation. Cloud computing platforms, e.g., Microsoft Azure, are the most convenient and economical way to utilize large-scale trajectory data efficiently and effectively. However, traditional cloud computing platforms are not designed to deal with spatio-temporal data, such as trajectories. To this end, we design and implement a holistic cloud-based trajectory data management system on Microsoft Azure to bridge the gap between trajectory data and urban applications. Our system can efficiently store, index, and query large trajectory data with three functions: 1) trajectory ID-temporal query, 2) trajectory spatio-temporal query, and 3) trajectory map matching. The efficiency of the system is tested and tuned based on real-time trajectory data feeds. The system is currently used in many internal urban applications, as we will illustrate using case studies.
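To make the first two query types concrete, the following is a minimal in-memory sketch, not the system's actual implementation. The names `GpsPoint`, `id_temporal_query`, and `spatio_temporal_query` are hypothetical, the sample coordinates are made up for illustration, and a plain linear scan stands in for whatever storage and indexing the cloud system uses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GpsPoint:
    """One sample of a trajectory (hypothetical record layout)."""
    traj_id: str
    t: float    # timestamp in seconds
    lat: float
    lon: float


def id_temporal_query(points: List[GpsPoint], traj_id: str,
                      t_start: float, t_end: float) -> List[GpsPoint]:
    """Return the points of one trajectory within [t_start, t_end], by time."""
    hits = [p for p in points
            if p.traj_id == traj_id and t_start <= p.t <= t_end]
    return sorted(hits, key=lambda p: p.t)


def spatio_temporal_query(points: List[GpsPoint],
                          bbox: Tuple[float, float, float, float],
                          t_start: float, t_end: float) -> List[GpsPoint]:
    """Return points inside a bounding box and a time window.

    bbox is (min_lat, min_lon, max_lat, max_lon). This is a linear scan
    for illustration only; a real system would use a spatio-temporal index.
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    return [p for p in points
            if min_lat <= p.lat <= max_lat
            and min_lon <= p.lon <= max_lon
            and t_start <= p.t <= t_end]


# Illustrative sample data (made up).
points = [
    GpsPoint("taxi-1", 0.0, 39.90, 116.40),
    GpsPoint("taxi-1", 60.0, 39.91, 116.41),
    GpsPoint("taxi-2", 30.0, 39.95, 116.50),
]

by_id = id_temporal_query(points, "taxi-1", 0, 30)
in_box = spatio_temporal_query(points, (39.90, 116.40, 39.92, 116.45), 0, 100)
```

The ID-temporal query filters on a trajectory identifier plus a time range, while the spatio-temporal query filters on a region plus a time range. The two access patterns favor different key layouts, which is one reason a system might treat them as separate functions.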
