A stepwise interpretable machine learning framework using linear regression (LR) and long short-term memory (LSTM): City-wide demand-side prediction of yellow taxi and for-hire vehicle (FHV) service

Abstract: As app-based ride-hailing services have been widely adopted alongside traditional taxi markets, researchers have sought to understand the factors that influence demand for this new mode of mobility. Econometric models (EMs) are mainly used to interpret the significant determinants of demand, while deep neural networks (DNNs) have recently been used to improve forecasting performance by capturing complex patterns in large datasets. However, to mitigate possible (induced) traffic congestion and to balance utilization rates for current taxi drivers, an effective strategy for proactively managing a quota system covering both emerging services and regular taxis is still critically needed. This paper aims to systematically design an explainable deep learning model capable of assessing a quota system that balances demand volumes between the two modes. A two-stage interpretable machine learning framework is developed that couples a linear regression (LR) model with a neural network built from long short-term memory (LSTM) layers. The first stage investigates the correlation between existing taxis and on-demand ride-hailing services while controlling for other explanatory variables. The second stage uses the LSTM network to model the residuals from the first-stage estimation in order to enhance forecasting performance. The proposed stepwise modeling approach (LR-LSTM) forecasts the demand for taxi rides and is applied to pick-up demand prediction using New York City (NYC) taxi data. The experimental results indicate that the integrated model can capture the inter-relationships between existing taxis and ride-hailing services and identify the influence of additional factors, namely the day of the week, weather, and holidays. Overall, this modeling approach can be applied to design effective short-term active demand management (ADM) as well as a quota control strategy between on-demand ride-hailing services and traditional taxis.
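
The two-stage idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the variable names (`fhv`, `is_weekend`, `lookback`) and helper functions are hypothetical, and the second stage is represented only by the residual windows that a sequence model such as an LSTM would consume. A last-value persistence forecast stands in for the LSTM purely to show how the two stages would be combined.

```python
import numpy as np

def fit_linear_stage(X, y):
    """Stage 1: ordinary least squares with an intercept (intercept is beta[0])."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def predict_linear_stage(beta, X):
    """Apply the fitted stage-1 coefficients to a design matrix."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ beta

def make_residual_windows(residuals, lookback):
    """Pair each run of `lookback` residuals with the residual that follows it."""
    n = len(residuals) - lookback
    windows = np.stack([residuals[i:i + lookback] for i in range(n)])
    targets = residuals[lookback:]
    return windows, targets

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# --- Synthetic hourly data (assumption: not the NYC dataset) ---
rng = np.random.default_rng(0)
n_hours = 24 * 28
hour = np.arange(n_hours)
fhv = 200 + 50 * np.sin(2 * np.pi * hour / 24) + rng.normal(0, 5, n_hours)
is_weekend = ((hour // 24) % 7 >= 5).astype(float)

# Autocorrelated component that a static linear model cannot explain.
noise = np.zeros(n_hours)
for t in range(1, n_hours):
    noise[t] = 0.8 * noise[t - 1] + rng.normal(0, 3)
taxi = 300 - 0.6 * fhv - 20 * is_weekend + noise

# --- Stage 1: interpretable linear regression ---
X = np.column_stack([fhv, is_weekend])
beta = fit_linear_stage(X, taxi)
lr_pred = predict_linear_stage(beta, X)
residuals = taxi - lr_pred

# --- Stage 2 input: residual windows for a sequence model ---
lookback = 24
windows, targets = make_residual_windows(residuals, lookback)

# Stand-in for the LSTM (assumption): repeat the last residual in each window.
resid_hat = windows[:, -1]
combined = lr_pred[lookback:] + resid_hat

print("stage-1 coefficients:", beta)
print("RMSE, LR only:      ", rmse(lr_pred[lookback:], taxi[lookback:]))
print("RMSE, LR + stand-in:", rmse(combined, taxi[lookback:]))
```

The coefficients in `beta` remain directly readable, which is the interpretable part of the design; the residual model only corrects what the linear stage leaves unexplained. In a fuller version, `windows` and `targets` would be the training inputs and labels for the LSTM in place of the persistence stand-in.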
