Travel mode choice: a data fusion model using machine learning methods and evidence from travel diary survey data

ABSTRACT In this paper, we present a series of machine learning approaches for better understanding people’s travel mode choice. The widely used Logit model depends on the assumption that the utility items are independent; violating this assumption leads to inconsistent parameter estimates and biased predictions. To improve the prediction accuracy of mode choice, this paper employs a data fusion model based on a stacking strategy and proposes a hybrid model that combines an unsupervised Denoising Autoencoder (DAE) with a supervised Random Forest (RF). A variety of features that may affect mode choice behavior are ranked and selected using feature selection algorithms. The proposed model is verified on travel diary data from Germany and Switzerland, where it outperforms other widely used classifiers. The results can be used to better understand and more effectively model human travel mode choice behavior.
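
The hybrid idea described in the abstract can be illustrated with a minimal sketch: a denoising autoencoder learns a compact representation from deliberately corrupted inputs, and a random forest is then trained on that representation. This is not the authors' implementation. The synthetic data, the masking-noise rate, the single hidden layer of 8 units, and the helper names `mask_noise` and `encode` are all assumptions made for illustration, and scikit-learn's `MLPRegressor` stands in for a purpose-built autoencoder.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler


def mask_noise(X, drop_prob, rng):
    """Corrupt inputs by zeroing each entry with probability drop_prob."""
    keep = rng.random(X.shape) >= drop_prob
    return X * keep


def encode(dae, X):
    """Return the hidden-layer (ReLU) activations of a one-hidden-layer MLP."""
    return np.maximum(0.0, X @ dae.coefs_[0] + dae.intercepts_[0])


rng = np.random.default_rng(0)

# Synthetic stand-in for trip records: 12 features, 4 hypothetical travel modes.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Unsupervised stage: reconstruct the clean inputs from corrupted copies.
dae = MLPRegressor(
    hidden_layer_sizes=(8,), activation="relu", max_iter=500, random_state=0
)
dae.fit(mask_noise(X_train, 0.2, rng), X_train)

# Supervised stage: a random forest trained on the learned representation.
Z_train, Z_test = encode(dae, X_train), encode(dae, X_test)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Z_train, y_train)

accuracy = rf.score(Z_test, y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.3f}")
```

The autoencoder never sees the mode labels, so the representation is learned without supervision; only the random forest uses them. Accuracy on this synthetic data says nothing about the results reported for the German and Swiss travel diaries.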
