Assessment of autoregressive integrated moving average (ARIMA), generalized linear autoregressive moving average (GLARMA), and random forest (RF) time series regression models for predicting influenza A virus frequency in swine in Ontario, Canada

Influenza A virus commonly circulating in swine (IAV-S) is characterized by large genetic and antigenic diversity. Improvements in several aspects of IAV-S surveillance are therefore needed to meet surveillance goals such as forecasting, as accurately as possible, the number of influenza cases likely to arise. Advances in modeling approaches provide the opportunity to use different models for surveillance; however, to improve surveillance, the predictive ability of such models must first be assessed. This study compares the sensitivity and predictive accuracy of the autoregressive integrated moving average (ARIMA) model, the generalized linear autoregressive moving average (GLARMA) model, and the random forest (RF) model with respect to the frequency of influenza A virus (IAV) in Ontario swine. Diagnostic data on IAV submissions in Ontario swine between 2007 and 2015 were obtained from the Animal Health Laboratory (University of Guelph, Guelph, ON, Canada). Each modeling approach was examined for predictive accuracy, evaluated by the root mean square error, the normalized root mean square error, and the model’s ability to anticipate increases and decreases in disease frequency. We also quantified the magnitude of improvement offered by the ARIMA, GLARMA, and RF models over a seasonal-naïve method. Using the diagnostic submissions, seasonality and the long-term trend in IAV infections were also investigated. The RF model had the smallest root mean square error in the prospective analysis and tended to predict increases in the number of diagnostic submissions and positive virological submissions at weekly and monthly intervals with a higher degree of sensitivity than the ARIMA and GLARMA models. The number of weekly positive virological submissions was significantly higher in the fall calendar season than in the summer calendar season. Positive counts at weekly and monthly intervals demonstrated a significant increasing trend. Overall, this study shows that the RF model offers enhanced predictive ability over the ARIMA and GLARMA time series models for predicting the frequency of IAV infections in diagnostic submissions.
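The evaluation criteria named in the abstract (root mean square error, normalized root mean square error, sensitivity to increases, and a seasonal-naïve baseline) can be illustrated with a minimal Python sketch. This is not the authors' code. The abstract does not state how the error was normalized or how an "increase" was defined, so normalizing by the range of the observed values and comparing each value with the previous observed value are assumptions, as are all function names and the toy counts.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse(observed, predicted):
    """RMSE divided by the observed range (ASSUMED normalization)."""
    spread = max(observed) - min(observed)
    if spread == 0:
        raise ValueError("observed values have zero range")
    return rmse(observed, predicted) / spread


def seasonal_naive(history, horizon, period):
    """Forecast each step as the value observed one season earlier."""
    if len(history) < period:
        raise ValueError("history must cover at least one full period")
    last_season = history[-period:]
    return [last_season[h % period] for h in range(horizon)]


def increase_sensitivity(observed, predicted, last_value):
    """Share of observed increases that the forecast also flags.

    ASSUMPTION: a step counts as an increase when it exceeds the
    previous observed value; the forecast is judged the same way.
    """
    previous = [last_value] + list(observed[:-1])
    hits = total = 0
    for prev, obs, pred in zip(previous, observed, predicted):
        if obs > prev:
            total += 1
            if pred > prev:
                hits += 1
    if total == 0:
        raise ValueError("no observed increases to evaluate")
    return hits / total


# Hypothetical counts with a period of 4, for illustration only.
history = [3, 5, 8, 4, 4, 6, 9, 5]
observed = [5, 7, 10, 6]
baseline = seasonal_naive(history, horizon=4, period=4)

baseline_rmse = rmse(observed, baseline)
baseline_nrmse = nrmse(observed, baseline)
baseline_sens = increase_sensitivity(observed, baseline, history[-1])
```

Under these assumptions, a candidate model's improvement over the baseline could be expressed as the relative reduction in RMSE against `baseline_rmse`.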
