Spurious patterns in Google Trends data - An analysis of the effects on tourism demand forecasting in Germany

Abstract: Previous studies show that time series data on the frequency of hits for tourism-related search terms on Google (Google Trends data) is a valuable predictor for short-term tourism demand forecasting in many tourism regions worldwide. This paper contributes to that literature in three ways. First, it shows that Google Trends data is useful for short-term predictions of monthly tourist arrivals in several German holiday regions. Second, it demonstrates that the Google Trends time series we employ share certain patterns with Google Trends time series used in previous studies, including several studies entirely unrelated to the tourism industry. We refer to these shared artefacts as “spurious patterns” and analyse their negative impact on forecasting in detail. Third, the paper proposes a method to sanitize Google Trends data and reduce the adverse impact of spurious patterns, thereby paving the way for statistically sound tourism demand forecasts.
