Interactive online machine learning approach for activity-travel survey

Introduction

Activity-travel survey methods using tracking devices have been developed since the late 1990s as effective means of collecting behavioural data (e.g. Asakura and Hato, 2004; Draijer et al., 2000). In these surveys, the trajectories of survey participants are collected automatically by mobile instruments such as Global Positioning System (GPS) devices. Web-based diaries synchronized with the data from the mobile instruments are used to supplement detailed information on trips and activities. Compared with traditional surveys such as Person Trip surveys and paper-based diary surveys, mobile instruments extend the observation period and improve resolution in both space and time.

However, even when such mobile instruments are used, survey participants must still enter detailed activity-travel information manually, because the data obtained from the instruments do not directly contain activity-travel attributes and contexts such as trip purpose and travel mode. The time and effort required of participants therefore grow as the survey period becomes longer. As a consequence, in previous studies the number of participants in most tracking surveys has remained below a thousand, and survey durations have been shorter than a few months (e.g. Asakura and Hato, 2004; Draijer et al., 2000). It is still difficult to collect day-to-day data over continuous long-term periods with these surveys because of cost, processing load, accuracy, and the need to protect respondents' privacy.

Several previous studies have attempted to develop methods that automatically add behavioural contexts to the data obtained from mobile instruments. For example, Shen and Stopher (2013) developed a trip purpose imputation method for GPS data using the National Household Travel Survey (NHTS) in the US. Cottrill et al. (2013) attempted to estimate travel attributes automatically in a Web-based diary system from mobile instrument data.
These conventional methods rely on predetermined parameters of discriminant functions for trip purposes and travel modes. They therefore require data to be collected beforehand in order to derive the relations between behavioural contexts and observed data. However, these relations can vary with the lifestyles and surrounding environments of survey participants, so it is preferable to update them according to the situation during the survey period.

This study proposes a framework for an interactive activity-travel survey system implemented on mobile devices such as smartphones. To make the method suitable for long-term activity-travel surveys, the system employs an online machine learning method for adapting the estimation model, together with an online estimation method for trip purposes. Compared with conventional travel diary surveys, the proposed system is expected to reduce how often survey participants are required to input information. In addition, unlike conventional offline estimation methods, the proposed system can automatically update the estimation model and adapt it to the current situations of the participants.

Methodology and Model

The proposed survey system consists of three processes: “move-or-stay identification”, “travel context estimation”, and “learning”. The move-or-stay identification process automatically identifies the locations and times at which trips start and end. In the travel context estimation process, travel contexts such as trip purposes are estimated from GPS and sensor data immediately after the end of a trip is detected. In the learning process, survey participants are asked to report their actual travel contexts, and the estimation model is updated using their answers. This inquiry is triggered randomly according to the confidence level of the estimation result. As the model parameters improve through this process, the frequency of inquiries about actual contexts is expected to decrease.
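
The travel context estimation and learning processes described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: it assumes a count-based categorical naive Bayes classifier with Laplace smoothing, features that have already been discretized, and an inquiry probability equal to one minus the posterior probability of the top-ranked purpose. The names OnlineNaiveBayes, process_trip and ask_participant, and the feature keys used with them, are hypothetical.

```python
import math
import random
from collections import defaultdict


class OnlineNaiveBayes:
    """Categorical naive Bayes whose counts are updated one trip at a time."""

    def __init__(self, purposes, alpha=1.0):
        self.purposes = list(purposes)
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.class_counts = {p: 0 for p in self.purposes}
        # feature_counts[purpose][feature_name][value] -> number of trips
        self.feature_counts = {
            p: defaultdict(lambda: defaultdict(int)) for p in self.purposes
        }
        self.values_seen = defaultdict(set)  # distinct values per feature

    def update(self, features, purpose):
        """Learning step: add one answered trip to the counts."""
        self.class_counts[purpose] += 1
        for name, value in features.items():
            self.feature_counts[purpose][name][value] += 1
            self.values_seen[name].add(value)

    def posterior(self, features):
        """Return a dict mapping each purpose to its posterior probability."""
        total = sum(self.class_counts.values())
        k = len(self.purposes)
        log_scores = {}
        for p in self.purposes:
            n_p = self.class_counts[p]
            score = math.log((n_p + self.alpha) / (total + self.alpha * k))
            for name, value in features.items():
                n_values = len(self.values_seen[name] | {value})
                count = self.feature_counts[p][name][value]
                score += math.log(
                    (count + self.alpha) / (n_p + self.alpha * n_values)
                )
            log_scores[p] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        weights = {p: math.exp(s - top) for p, s in log_scores.items()}
        norm = sum(weights.values())
        return {p: w / norm for p, w in weights.items()}


def process_trip(model, features, ask_participant, rng=random):
    """Estimate the purpose of a finished trip and decide whether to ask.

    Assumption: the participant is asked with probability 1 - confidence,
    where confidence is the posterior of the most likely purpose.
    Returns (purpose, asked).
    """
    post = model.posterior(features)
    estimate = max(post, key=post.get)
    confidence = post[estimate]
    if rng.random() < 1.0 - confidence:
        answer = ask_participant(features)  # hypothetical survey prompt
        model.update(features, answer)      # online learning step
        return answer, True
    return estimate, False
```

In this sketch, process_trip would be called once each time the move-or-stay identification detects the end of a trip. With an untrained model the posterior is uniform, so inquiries are frequent; as answered trips accumulate and the posterior sharpens, the inquiry probability falls, which mirrors the expected reduction in input burden.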
When the confidence of an estimation result is high enough, participants are seldom asked to report their actual activities, because the estimation model has already been well estimated. In this study, trip purposes are estimated as the travel context. For the move-or-stay identification process, we employ the method proposed by Asakura and Hato (2004), which is one of the simplest available. The naive Bayes classifier (Rish, 2001) is employed for travel context estimation; it requires a smaller amount of data than other machine learning models.

Empirical Analysis

The empirical analysis consists of two stages. In the first stage, the proposed model is examined using actual behavioural data obtained from a previous Probe Person survey. This validation uses the Web-based diary data from the Probe Person survey conducted from 17 October 2007 to 3 February 2008 in Matsuyama city, Japan. There were 197 participants in the survey, each of whom took part for a maximum of 14 days. A simulation analysis was conducted on these data to validate the proposed method. In the second stage, a pilot survey was conducted using an actual system implemented on smartphones to examine how the proposed method behaves in real survey conditions.

Results

In the first-stage analysis, the number of trips for which the actual purpose had to be input decreased by 59%, and the trip purpose was correctly obtained for 89% of trips. These results suggest that the proposed method can reduce the amount of information that participants must input during an activity-travel survey while maintaining the quality of the obtained data.

References

Asakura, Y. and Hato, E., 2004. Tracking individual travel behaviour using mobile communication instruments. Transportation Research Part C 12, 273–291.
Cottrill, C. D., Pereira, F. C., Zhao, F., Dias, I. F., Lim, H. B., Ben-Akiva, M. E., Zegras, P. C., 2013. Future mobility survey. Transportation Research Record: Journal of the Transportation Research Board 2354, 59–67.
Draijer, G., Kalfs, N., Perdok, J., 2000. Global positioning system as a data collection method for travel research. Transportation Research Record 1719, 147–153.
Rish, I., 2001. An empirical study of the naive Bayes classifier. IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, Seattle, Washington, USA, 4–10 August 2001.
Shen, L. and Stopher, P. R., 2013. A process for trip purpose imputation from Global Positioning System data. Transportation Research Part C: Emerging Technologies 36, 261–267.
