Gaussian Process Latent Class Choice Models

We present a Gaussian Process – Latent Class Choice Model (GP-LCCM) that integrates a nonparametric class of probabilistic machine learning into discrete choice models (DCMs). Gaussian Processes (GPs) are kernel-based algorithms that incorporate expert knowledge by placing priors over latent functions rather than over parameters, which makes them more flexible in addressing nonlinear problems. By integrating a GP within an LCCM structure, we aim to improve discrete representations of unobserved heterogeneity. The proposed model assigns individuals probabilistically to behaviorally homogeneous clusters (latent classes) using GPs and simultaneously estimates class-specific choice models based on random utility models. Furthermore, we derive and implement an Expectation-Maximization (EM) algorithm that jointly infers the hyperparameters of the GP kernel function, using a Laplace approximation, and estimates the class-specific choice parameters, using gradient-based numerical optimization. The model is tested on two different mode choice applications and compared against different LCCM benchmarks. Results show that the GP-LCCM allows for a more complex and flexible representation of heterogeneity and improves both in-sample fit and out-of-sample predictive power. Moreover, behavioral and economic interpretability is maintained at the class-specific choice model level, and local interpretation of the latent classes can still be achieved, although the nonparametric nature of GPs lessens the transparency of the model.
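
To make the role of the EM algorithm more concrete, the following is a minimal sketch of the E-step of a generic latent class choice model: it combines prior class-membership probabilities with class-specific choice likelihoods to obtain posterior class responsibilities for one individual. It is an illustration under stated assumptions, not the authors' implementation. The class-specific models are assumed to be multinomial logit, and the prior membership probabilities are passed in as plain numbers, whereas in the GP-LCCM they would come from the GP component, which is not reproduced here. The names `softmax` and `e_step` are hypothetical.

```python
import math


def softmax(utilities):
    """Multinomial logit probabilities for one list of alternative utilities."""
    shift = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - shift) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def e_step(class_priors, class_utilities, chosen):
    """Posterior class responsibilities for a single individual.

    class_priors:    prior membership probability per class (assumed given;
                     in a GP-LCCM these would be produced by the GP).
    class_utilities: class_utilities[k][t] is the list of alternative
                     utilities under class k in choice situation t.
    chosen:          chosen[t] is the index of the alternative picked in
                     choice situation t.
    """
    joint = []
    for prior, utilities_k in zip(class_priors, class_utilities):
        likelihood = 1.0
        for utilities_t, y in zip(utilities_k, chosen):
            # Probability of the observed choice under this class's logit model
            likelihood *= softmax(utilities_t)[y]
        joint.append(prior * likelihood)
    total = sum(joint)
    # Normalize so the responsibilities sum to one across classes
    return [j / total for j in joint]


# Two classes, one choice situation with two alternatives, alternative 0 chosen.
responsibilities = e_step(
    class_priors=[0.5, 0.5],
    class_utilities=[[[1.0, 0.0]], [[0.0, 1.0]]],
    chosen=[0],
)
```

In a full EM loop, these responsibilities would then weight the M-step updates of both the class-membership component and the class-specific choice parameters.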
