Stacked Hybrid Discrete Choice Models for Airline Itinerary Choice

This study develops a methodology to train and apply a hybrid stacked discrete choice model for airline itinerary choice. The stacked framework combines a data-driven component (gradient boosting machines) with a theory-driven component (utility maximization using generalized extreme value models). The resulting ensemble retains attractive features of each: it can capture complex nonlinear relationships among itinerary characteristics, and it can leverage an analyst’s understanding of travel behavior and the natural relationships among itineraries. Using a real industry dataset containing purchase information for approximately 10 million air travelers, the stacked model is shown to forecast air traveler choice behavior better than either the gradient boosting or the utility maximization paradigm alone. The model can be implemented with efficient open-source tools, including XGBoost and Larch, and requires only modest additional analyst effort beyond that needed to use either tool alone.
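The stacking idea can be illustrated with a minimal sketch. The abstract does not state how the two components are connected, so the sketch assumes one common arrangement: a score from the data-driven model enters the utility function of a logit model as an extra term alongside an analyst-specified variable. The names `ml_score`, `beta_fare`, and `beta_score`, the toy data, and the plain gradient-ascent estimator are all hypothetical. In practice the score would come from a fitted gradient boosting model (for example, XGBoost) and the choice model would be estimated with a dedicated package such as Larch; a simple multinomial logit stands in here for the generalized extreme value family.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit (softmax) probabilities for one choice set."""
    top = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def stacked_utilities(choice_set, beta_fare, beta_score):
    """Utility = analyst-specified term + weight on the data-driven score.

    Each alternative is a (fare, ml_score) pair. ml_score is a hypothetical
    stand-in for the output of a gradient boosting model.
    """
    return [beta_fare * fare + beta_score * score for fare, score in choice_set]

def log_likelihood(data, beta_fare, beta_score):
    """Sum of log-probabilities of the chosen itineraries."""
    total = 0.0
    for choice_set, chosen in data:
        probs = mnl_probabilities(stacked_utilities(choice_set, beta_fare, beta_score))
        total += math.log(probs[chosen])
    return total

def fit(data, steps=200, learning_rate=0.5):
    """Estimate both coefficients by gradient ascent on the mean log-likelihood."""
    beta_fare = beta_score = 0.0
    for _ in range(steps):
        grad_fare = grad_score = 0.0
        for choice_set, chosen in data:
            probs = mnl_probabilities(stacked_utilities(choice_set, beta_fare, beta_score))
            for j, (fare, score) in enumerate(choice_set):
                residual = (1.0 if j == chosen else 0.0) - probs[j]
                grad_fare += residual * fare
                grad_score += residual * score
        beta_fare += learning_rate * grad_fare / len(data)
        beta_score += learning_rate * grad_score / len(data)
    return beta_fare, beta_score

# Toy choice sets (assumed data): fares in hundreds of dollars, scores in [0, 1].
# Each entry is (list of alternatives, index of the purchased itinerary).
data = [
    ([(1.0, 0.6), (1.5, 0.3), (2.0, 0.1)], 0),
    ([(2.0, 0.2), (1.2, 0.5), (1.8, 0.3)], 1),
    ([(1.1, 0.3), (1.0, 0.2), (1.6, 0.5)], 2),
    ([(1.4, 0.7), (1.3, 0.3)], 0),
    ([(1.0, 0.4), (1.9, 0.6)], 1),
]

beta_fare, beta_score = fit(data)
print("beta_fare:", round(beta_fare, 3), "beta_score:", round(beta_score, 3))
print("log-likelihood at zero:", round(log_likelihood(data, 0.0, 0.0), 3))
print("log-likelihood fitted: ", round(log_likelihood(data, beta_fare, beta_score), 3))
```

Under this arrangement the analyst keeps direct control over the theory-driven terms, while the coefficient on the score indicates how much weight the combined model places on the data-driven component.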
