Improving Influenza Forecasting with Web-Based Social Data

Improving seasonal influenza forecasting by combining official data sources with web search and social media data is a recent research topic that can enhance the situational awareness of healthcare organizations monitoring seasonal flu outbreaks. In this paper, we propose an autoregression-based prediction model that combines data from the official influenza surveillance system with influenza-related data from web search and social media. The model is evaluated on the 2016–2017 and 2017–2018 influenza seasons in Italy. The results show that Web-based social data, such as Google search queries and tweets, yield accurate weekly influenza predictions up to four weeks in advance. Compared with traditional surveillance systems based on data from sentinel doctors, the proposed approach improves real-time influenza forecasting: the prediction error is reduced by up to 47%, while the Pearson correlation improves by about 24%.
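
To make the idea of an autoregressive model augmented with web-based signals concrete, the following is a minimal sketch, not the model specified in this paper. It assumes a weekly surveillance series `y`, a matrix `x` of exogenous web signals (for example, search-query and tweet volumes, one column each), a lag order `p`, and a forecast horizon `h` in weeks. The function names, the ordinary least-squares fit, and the choice of lags are illustrative assumptions.

```python
import numpy as np

def build_design(y, x, p, h):
    """Build rows [1, y[t], ..., y[t-p+1], x[t]] with target y[t+h].

    y: 1-D array of weekly incidence values (assumed input).
    x: 2-D array of shape (len(y), k) with web-based signals (assumed input).
    """
    rows, targets = [], []
    for t in range(p - 1, len(y) - h):
        lags = y[t - p + 1 : t + 1][::-1]           # most recent week first
        rows.append(np.concatenate(([1.0], lags, x[t])))
        targets.append(y[t + h])
    return np.array(rows), np.array(targets)

def fit_arx(y, x, p=2, h=1):
    """Fit intercept, p autoregressive terms and k exogenous terms by least squares."""
    A, b = build_design(y, x, p, h)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef

def predict_arx(coef, y, x, p=2):
    """Forecast h weeks past the last observation, using the horizon coef was fitted for."""
    lags = y[-p:][::-1]
    features = np.concatenate(([1.0], lags, x[-1]))
    return float(features @ coef)
```

Under these assumptions, forecasts up to four weeks ahead would be obtained by fitting one coefficient vector per horizon, calling `fit_arx` with `h` set to 1, 2, 3 and 4 in turn.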