Fine-grained tourism prediction: Impact of social and environmental features

Abstract Accurate predictions about future events are essential in many areas, one of them being the tourism industry. Cities and countries usually invest large amounts of money in planning and preparation in order to welcome (and profit from) tourists. The success of many businesses depends largely or totally on the state of tourism demand. Estimating tourism demand can help business planners reduce the risk of decisions about the future, since tourism products are, generally speaking, perishable (gone if not used). Prior studies in this domain focus on forecasting for a whole country rather than for fine-grained areas within a country (e.g., specific touristic attractions), mainly because of a lack of data. Our article tackles exactly this issue. With the rapid growth in popularity of social media applications, each year more people use online resources to plan and comment on their trips. Motivated by this observation, we suggest that data accessible in online social networks or travel websites, in addition to environmental data, can be used to support the inference of visitation counts for either indoor or outdoor touristic attractions. To test our hypothesis, we analyze visitation counts, environmental features and social media data related to 27 museums and galleries in the U.K. as well as 76 national parks in the U.S. Our experimental results reveal high accuracy levels (above 92%) for predicting tourism demand using features from both social media and environmental data. We also show that, for outdoor attractions, environmental features have better predictive power, while the opposite occurs for indoor attractions. In any case, the best results, in all scenarios, are obtained when using both types of features jointly. Finally, we perform a detailed failure analysis to inspect the cases in which the prediction results are not satisfactory.
