Forecasting cancellation rates for services booking revenue management using data mining

Revenue management (RM) enhances a company's revenues by means of demand-management decisions. An RM system must take into account the possibility that a booking may be canceled, or that a booked customer may fail to show up at the time of service (no-show). We review the cancellation rate forecasting models proposed in the literature that are based on data mining of Passenger Name Record (PNR) data, which mainly address the no-show case. Using a real-world dataset, we illustrate that the set of variables relevant to describing cancellation behavior differs considerably across stages of the booking horizon, which not only confirms the dynamic aspect of this problem but will also help revenue managers better understand the drivers of cancellation. Finally, we examine the performance of state-of-the-art data mining methods when applied to PNR-based cancellation rate forecasting.
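
To make the forecasting target concrete, the following minimal sketch shows one way per-booking cancellation probabilities could be aggregated into a cancellation rate for the bookings on hand. It assumes that some classifier has already produced a probability for each booking; the function names and the example probabilities are hypothetical and are not taken from the paper.

```python
from statistics import mean


def forecast_cancellation_rate(cancel_probs):
    """Return the expected fraction of bookings on hand that will cancel.

    cancel_probs: per-booking cancellation probabilities, assumed to come
    from some probability-estimating model (hypothetical input).
    """
    if not cancel_probs:
        raise ValueError("need at least one booking")
    if any(p < 0.0 or p > 1.0 for p in cancel_probs):
        raise ValueError("probabilities must lie in [0, 1]")
    return mean(cancel_probs)


def expected_net_bookings(cancel_probs):
    """Return the expected number of bookings that survive until service."""
    return sum(1.0 - p for p in cancel_probs)


# Hypothetical probabilities for four bookings currently on hand.
probs = [0.10, 0.40, 0.25, 0.05]
rate = forecast_cancellation_rate(probs)
net = expected_net_bookings(probs)
print(f"forecast cancellation rate: {rate:.2f}")
print(f"expected net bookings: {net:.2f}")
```

Because the relevant variables are reported to change along the booking horizon, such an aggregation would plausibly be recomputed at each stage with probabilities from a model fitted for that stage.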
