Analyzing high speed rail passengers’ train choices based on new online booking data in China

Abstract This study explores two nonparametric machine learning methods, support vector regression (SVR) and artificial neural networks (ANN), for understanding and predicting high-speed rail (HSR) travelers’ choices of ticket purchase timing, train type, and travel class from ticket sales data. In the train choice literature, discrete choice analysis is the predominant approach, and many variants of logit models have been developed. Emerging travel choice studies instead adopt non-utility-based methods, especially nonparametric machine learning methods such as SVR and ANN, because (1) these methods do not rely on assumptions about the relations between choices and explanatory variables, or on any prior knowledge of those relations, and (2) they are well suited to iteratively identifying patterns and extracting rules from data. This paper contributes to the HSR train choice literature by applying and comparing SVR and ANN in a real-world case study of the Shanghai-Beijing HSR market in China. A new normalized metric capturing both the load factor and the booking lead time is proposed as the target variable, and several train service attributes, such as day of week, departure time, travel time, and fare, are identified as input variables. Computational results demonstrate that both SVR and ANN predict train choice behavior with high accuracy and outperform a linear regression approach. Potential applications of this study, such as rail pricing reform, are also identified.
