Retrospective analysis of the possibility of predicting the COVID-19 outbreak from Internet searches and social media data, China, 2020

Internet search and social media activity about the coronavirus disease 2019 (COVID-19) outbreak peaked 10–14 days earlier than daily incidence in China. Both data sources correlated strongly with daily incidence, with a maximum correlation coefficient of r > 0.89 in every correlation examined. Lag correlation analysis showed that the correlation was highest at a lag of 8–12 days for laboratory-confirmed cases and 6–8 days for suspected cases.
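The lag correlation idea can be illustrated with a minimal sketch. This is not the authors' code or data: the function name `lagged_pearson`, the Pearson coefficient as the correlation measure, and the two synthetic bell-shaped curves (with the case curve deliberately placed 10 days after the search curve) are all assumptions made for illustration.

```python
import numpy as np


def lagged_pearson(search, cases, max_lag):
    """Return {lag: r}, where r is the Pearson correlation between
    search[t] and cases[t + lag] for lag = 0..max_lag (in days).

    Hypothetical helper for illustration; not taken from the paper.
    """
    search = np.asarray(search, dtype=float)
    cases = np.asarray(cases, dtype=float)
    if len(search) != len(cases):
        raise ValueError("series must have the same length")
    results = {}
    for lag in range(max_lag + 1):
        # Drop the last `lag` search values and the first `lag` case values
        # so that search on day t is paired with cases on day t + lag.
        x = search[: len(search) - lag]
        y = cases[lag:]
        results[lag] = float(np.corrcoef(x, y)[0, 1])
    return results


# Synthetic series (assumed, not real surveillance data):
# search interest peaks on day 20, daily cases peak on day 30.
days = np.arange(60)
search = np.exp(-0.5 * ((days - 20) / 5.0) ** 2)
cases = np.exp(-0.5 * ((days - 30) / 5.0) ** 2)

corr = lagged_pearson(search, cases, max_lag=20)
best_lag = max(corr, key=corr.get)
print("lag with highest correlation (days):", best_lag)
```

On these synthetic curves the highest correlation falls at the 10-day shift that was built into the data. With real search and incidence series, the peak of `corr` would be read the same way, as the number of days by which online activity leads reported cases.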
