Using machine learning and big data for efficient forecasting of hotel booking cancellations

Abstract Cancellations are a key aspect of hotel revenue management because of their impact on room reservation systems. Yet very little is known about the reasons that lead customers to cancel, or about how cancellations can be avoided. The aim of this paper is to propose a means of forecasting hotel booking cancellations using only 13 independent variables, fewer than in related research in the area, which moreover coincide with those most often requested from customers when they place a reservation. To this end, machine-learning techniques, among others artificial neural networks optimised with genetic algorithms, were applied, achieving a cancellation forecasting accuracy of up to 98%. The proposed methodology makes it possible not only to estimate cancellation rates, but also to identify which customers are likely to cancel. This approach would allow organisations to strengthen their action protocols regarding tourist arrivals.
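The difference between estimating a cancellation rate and identifying which customers are likely to cancel can be illustrated with a minimal sketch. It assumes that some classifier has already produced a cancellation probability for each booking; the `Booking` type, the booking IDs, the probabilities and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Booking:
    booking_id: str            # hypothetical identifier
    cancel_probability: float  # hypothetical classifier output in [0, 1]


def flag_likely_cancellations(bookings, threshold=0.5):
    """Customer-level view: IDs of bookings at or above the threshold."""
    return [b.booking_id for b in bookings if b.cancel_probability >= threshold]


def expected_cancellation_rate(bookings):
    """Aggregate view: mean predicted probability over all bookings."""
    if not bookings:
        return 0.0
    return sum(b.cancel_probability for b in bookings) / len(bookings)


# Made-up probabilities, for illustration only.
bookings = [
    Booking("A1", 0.875),
    Booking("A2", 0.25),
    Booking("A3", 0.625),
    Booking("A4", 0.125),
]

flagged = flag_likely_cancellations(bookings)
rate = expected_cancellation_rate(bookings)
print(flagged)  # the individual bookings a hotel might follow up on
print(rate)     # the overall share of bookings expected to cancel
```

The aggregate figure alone would support overbooking decisions, whereas the flagged list is what would allow a hotel to act on specific reservations.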
