Understanding the Impact of Socio-Economic and Environmental Factors for Disease Outbreak in Developing Countries

The growing impact of disease outbreaks has emphasized the need for data mining and machine learning techniques for their analysis and prediction. Doing this effectively requires an elaborate and reliable data management system. Unfortunately, such systems do not exist in many developing countries, where the available information can be sparse and noisy, with important factors missing from the data. In this paper, we report on a study of three diseases and their outbreaks in a developing country (Pakistan), with the goal of gaining a better understanding of the environmental/weather and socio-economic factors that affect them. The data available from local health units contained only the numbers of reported cases from different localities. We first enrich this data by fusing it with key environmental and socio-economic factors obtained from other sources. We then perform independent factor analysis of the augmented data using decision trees and logistic regression. We study cross-disease and cross-locality impacts, and show the effectiveness of an outbreak prediction model. Our results highlight combinations of factors influencing disease outbreaks, which can guide administrators in mitigating them.
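The enrichment and factor-analysis steps can be illustrated with a minimal sketch. It is not the authors' implementation: the case counts, the two factors (rainfall and piped-water coverage), the outbreak threshold, and all function names are hypothetical, and only the logistic regression part is shown, written in plain Python rather than with a library.

```python
import math

# Hypothetical case counts from health units: (locality, month) -> reported cases.
cases = {
    ("A", 1): 3, ("A", 2): 40, ("A", 3): 55,
    ("B", 1): 2, ("B", 2): 5, ("B", 3): 35,
}

# Hypothetical external factors: (locality, month) -> (rainfall in mm,
# fraction of households with piped water). Both factors are assumptions.
factors = {
    ("A", 1): (10.0, 0.3), ("A", 2): (120.0, 0.3), ("A", 3): (150.0, 0.3),
    ("B", 1): (5.0, 0.8), ("B", 2): (60.0, 0.8), ("B", 3): (140.0, 0.8),
}

OUTBREAK_THRESHOLD = 30  # assumed cutoff; the abstract does not define one


def fuse(cases, factors, threshold):
    """Join case counts with external factors and label each row as outbreak or not."""
    rows = []
    for key, count in sorted(cases.items()):
        if key not in factors:
            continue  # skip records that have no matching factor data
        rows.append((list(factors[key]), 1 if count >= threshold else 0))
    return rows


def standardize(X):
    """Rescale each column to zero mean and unit variance so weights are comparable."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression with batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, xi):
    """Return 1 if the predicted outbreak probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


rows = fuse(cases, factors, OUTBREAK_THRESHOLD)
X = standardize([features for features, _ in rows])
y = [label for _, label in rows]
w, b = fit_logistic(X, y)
print("weights (rainfall, piped water):", w)
```

Because the features are standardized, the sign and relative size of each weight give a rough indication of how strongly a factor is associated with outbreaks in this toy data. A decision tree fitted to the same fused rows would instead expose combinations of factors as threshold rules.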
