A Data-Driven Method for Trip Ends Identification Using Large-Scale Smartphone-Based GPS Tracking Data

Using tracking data obtained from a smartphone and Internet survey, this paper proposes a data-driven machine learning method to identify trip ends. Previous studies usually rely on predefined rules, which have been confirmed to be valid. Nonetheless, these rule-based methods depend largely on researchers’ own knowledge, which is inevitably subjective and arbitrary. Moreover, they are not effective enough to process the huge volumes of data available in the era of big data. This paper targets millions of smartphone-based GPS tracking points. A group of attributes, such as travel speed, distance, and heading, is derived to characterize the smartphone holders’ travel status. Based on these attributes, each tracking point is identified as being in a traveling or non-traveling state, from which the trip ends are easily detected. In contrast to rule-based methods, a random forest is used as the classification model, with no subjective rules predefined for classification, so the data-driven model is built automatically. The results show that, after the random forest is trained on GPS tracking data from 1393 days together with the prompted recall (PR) survey data, the accuracy of trip end identification on tracking data from 697 days is 96.17%. Because the analysis does not rely on personal experience, it is expected to be useful for smartphone-based survey data in the era of big data.
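
The pipeline summarized in the abstract (derive per-point attributes, label each point as traveling or non-traveling, then read trip ends off the state changes) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the paper's implementation: the function names, the `(t_seconds, lat, lon)` point layout, the choice of haversine distance and initial bearing as the distance and heading attributes, and the 0/1 label encoding are all hypothetical. The classification step itself is left out; the labels passed to `trip_ends` stand in for the output of a trained classifier such as a random forest.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres (assumed spherical Earth)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees within [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def derive_features(points):
    """Derive illustrative attributes for each consecutive pair of points.

    points: list of (t_seconds, lat, lon) tuples sorted by time (hypothetical layout).
    Returns one dict per pair with distance, speed, and heading change.
    """
    feats = []
    prev_heading = None
    for (t0, la0, lo0), (t1, la1, lo1) in zip(points, points[1:]):
        dist = haversine_m(la0, lo0, la1, lo1)
        dt = t1 - t0
        speed = dist / dt if dt > 0 else 0.0  # guard against duplicate timestamps
        heading = bearing_deg(la0, lo0, la1, lo1)
        if prev_heading is None:
            turn = 0.0  # no previous segment to compare against
        else:
            d = abs(heading - prev_heading) % 360.0
            turn = min(d, 360.0 - d)  # smallest angle between the two headings
        feats.append(
            {"distance_m": dist, "speed_mps": speed, "heading_change_deg": turn}
        )
        prev_heading = heading
    return feats


def trip_ends(labels):
    """Return (start_index, end_index) for each maximal run of traveling points.

    labels: per-point states, 1 = traveling, 0 = non-traveling (assumed encoding),
    e.g. the predictions of a classifier trained on the derived attributes.
    """
    trips = []
    start = None
    for i, state in enumerate(labels):
        if state == 1 and start is None:
            start = i  # non-traveling -> traveling: a trip begins
        elif state != 1 and start is not None:
            trips.append((start, i - 1))  # traveling -> non-traveling: a trip ends
            start = None
    if start is not None:
        trips.append((start, len(labels) - 1))  # sequence ended while traveling
    return trips
```

In this sketch, a label sequence such as `[0, 1, 1, 0, 0, 1, 1, 1]` yields two trips, spanning point indices 1 to 2 and 5 to 7. A real pipeline would likely also need to smooth isolated misclassified points before extracting trip ends, which is omitted here.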
