Comparison of ARIMA and Random Forest time series models for prediction of avian influenza H5N1 outbreaks

Background: Time series models can play an important role in disease prediction. Incidence data can be used to predict the future occurrence of disease events. Developments in modeling approaches provide an opportunity to compare different time series models for predictive power.

Results: We applied ARIMA and Random Forest time series models to incidence data of outbreaks of highly pathogenic avian influenza (H5N1) in Egypt, available through the online EMPRES-I system. We found that the Random Forest model outperformed the ARIMA model in predictive ability. Furthermore, we found that the Random Forest model is effective for predicting outbreaks of H5N1 in Egypt.

Conclusions: Random Forest time series modeling provides enhanced predictive ability over existing time series models for the prediction of infectious disease outbreaks. This result, along with those showing the concordance between bird and human outbreaks (Rabinowitz et al. 2012), provides a new approach to predicting these dangerous outbreaks in bird populations based on existing, freely available data. Our analysis uncovers the time-series structure of outbreak severity for highly pathogenic avian influenza (H5N1) in Egypt.
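A tree ensemble such as a Random Forest has no built-in notion of time, so a common way to apply one to incidence data is to recast the series as a supervised problem in which each observation is predicted from its own recent lags. The sketch below illustrates that general idea only and is not the authors' pipeline: the function names, the choice of three lags, the persistence baseline, and the outbreak counts are all hypothetical, and the counts are made-up numbers rather than EMPRES-I data.

```python
from math import sqrt


def make_lagged(series, n_lags):
    """Turn a univariate series into (X, y): X[i] holds the n_lags values
    that immediately precede the target y[i]."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X = [list(series[t - n_lags:t]) for t in range(n_lags, len(series))]
    y = [series[t] for t in range(n_lags, len(series))]
    return X, y


def time_ordered_split(X, y, n_test):
    """Hold out the last n_test rows so the test period follows training."""
    return X[:-n_test], y[:-n_test], X[-n_test:], y[-n_test:]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical outbreak counts per period (illustrative values only).
counts = [3, 5, 8, 6, 4, 2, 1, 3, 7, 9, 6, 4]

X, y = make_lagged(counts, n_lags=3)
X_train, y_train, X_test, y_test = time_ordered_split(X, y, n_test=3)

# Persistence baseline: forecast each value as the most recent observed lag.
baseline_pred = [row[-1] for row in X_test]
baseline_rmse = rmse(y_test, baseline_pred)
```

Under these assumptions, any regression learner (for example scikit-learn's `RandomForestRegressor`) could be fit on `X_train` and `y_train` and scored with the same `rmse` function on the held-out rows, which gives a like-for-like comparison against the baseline or against an ARIMA forecast of the same periods. Splitting by time rather than at random keeps future observations out of the training set.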

[1]  Peter Dalgaard,et al.  R Development Core Team (2010): R: A language and environment for statistical computing , 2010 .

[2]  P. Rosenthal,et al.  Molecular markers of antifolate resistance in Plasmodium falciparum isolates from Luanda, Angola , 2011, Malaria Journal.

[3]  Keiko A. Herrick,et al.  A global model of avian influenza prediction in wild birds: the importance of northern regions , 2013, Veterinary Research.

[4]  Richard K. Kiang,et al.  Modeling and Predicting Seasonal Influenza Transmission in Warm Regions Using Climatological Parameters , 2010, PloS one.

[5]  Leo Breiman,et al.  Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author) , 2001 .

[6]  Kenneth C. Earhart,et al.  Zoonotic Transmission of Avian Influenza Virus (H5N1), Egypt, 2006–2009 , 2010, Emerging infectious diseases.

[7]  Yongwimon Lenbury,et al.  Modeling seasonal leptospirosis transmission and its association with rainfall and temperature in Thailand using time-series and ARIMAX analyses , 2012 .

[8]  M. Woolhouse,et al.  Ecological Origins of Novel Human Pathogens , 2007, Critical reviews in microbiology.

[9]  Mathieu Nacher,et al.  The role of El Niño southern oscillation (ENSO) on variations of monthly Plasmodium falciparum malaria cases at the cayenne general hospital, 1996-2009, French Guiana , 2010, Malaria Journal.

[10]  R Core Team,et al.  R: A language and environment for statistical computing. , 2014 .

[11]  Matthew Scotch,et al.  Comparison of Human and Animal Surveillance Data for H5N1 Influenza A in Egypt 2006–2011 , 2012, PloS one.

[12]  M. Caley,et al.  Global Patterns and Predictions of Seafloor Biomass Using Random Forests , 2010, PloS one.

[13]  N. Ferguson,et al.  Epidemic and intervention modelling--a scientific rationale for policy decisions? Lessons from the 2009 influenza pandemic. , 2012, Bulletin of the World Health Organization.

[14]  J. Mills,et al.  PREDICTION OF PEROMYSCUS MANICULATUS (DEER MOUSE) POPULATION DYNAMICS IN MONTANA, USA, USING SATELLITE-DRIVEN VEGETATION PRODUCTIVITY AND WEATHER DATA , 2012, Journal of wildlife diseases.

[15]  Andrew Kusiak,et al.  A data-mining approach to predict influent quality , 2013, Environmental Monitoring and Assessment.

[16]  T. Bollerslev,et al.  Generalized autoregressive conditional heteroskedasticity , 1986 .

[17]  Yi Guan,et al.  Avian Influenza Virus (H5N1): a Threat to Human Health , 2007, Clinical Microbiology Reviews.

[18]  Aleksandra J. Snowden,et al.  Reduction in suicide mortality following a new national alcohol policy in Slovenia: an interrupted time-series analysis. , 2009, American journal of public health.

[19]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[20]  Andy Liaw,et al.  Classification and Regression by randomForest , 2007 .

[21]  A. Orcau,et al.  Monitoring mortality as an indicator of influenza in Catalonia, Spain. , 1996, Journal of epidemiology and community health.