Using Machine Learning to Predict Swine Movements within a Regional Program to Improve Control of Infectious Diseases in the US

Between-farm animal movement is one of the most important factors influencing the spread of infectious diseases in food animals, including in the US swine industry. Understanding the structural network of contacts in a food animal industry is a prerequisite to planning efficient production strategies and effective disease control measures. Unfortunately, data on between-farm animal movements in the US are not systematically collected, so such information is often unavailable. In this paper, we develop a procedure to replicate the structure of a network from the partial data available, and then use the resulting model to predict animal movements among sites in 34 Minnesota counties. First, we summarized two networks of swine-producing facilities in Minnesota. We then used a machine learning technique known as random forest, an ensemble of independent classification trees, to estimate the probability of pig movements between farm and/or market sites located in two counties in Minnesota. The model was calibrated and tested by comparing predicted and observed data in those two counties, for which data were available. Finally, the model was used to predict animal movements among sites located across 34 Minnesota counties. Variables that were important in predicting pig movements included between-site distance, ownership, and production type of the sending and receiving farms and/or markets. Using a weighted-kernel approach to describe spatial variation in the centrality measures of the predicted network, we showed that the south-central region of the study area exhibited high aggregation of predicted pig movements. Our results show an overlap with the distribution of outbreaks of porcine reproductive and respiratory syndrome, which is believed to be transmitted, at least in part, through animal movements. While the correspondence between movements and disease is not a causal test, it suggests that the predicted network may approximate actual movements. Accordingly, the predictions provided here might help to design and implement control strategies in the region. Additionally, the methodology presented here may be used to estimate contact networks for other livestock systems when only incomplete information regarding animal movements is available.
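The predictors named above (between-site distance, ownership, and production type of the sending and receiving sites) are attributes of ordered pairs of sites rather than of single sites. The following is a minimal sketch, not the authors' implementation, of how such a pairwise feature table might be assembled before it is passed to a classifier such as a random forest. The site records, field names, and coordinates are hypothetical, and the use of great-circle (haversine) distance is an assumption; the abstract does not state how distance was measured.

```python
import math
from itertools import permutations

# Hypothetical site records; field names and values are illustrative only.
SITES = [
    {"id": "A", "lat": 44.00, "lon": -94.00, "owner": "own1", "type": "sow"},
    {"id": "B", "lat": 44.10, "lon": -94.20, "owner": "own1", "type": "finisher"},
    {"id": "C", "lat": 43.80, "lon": -93.50, "owner": "own2", "type": "market"},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (an assumed distance definition)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def pair_features(sites):
    """Build one feature row per ordered (sender, receiver) pair of sites."""
    rows = []
    for sender, receiver in permutations(sites, 2):
        rows.append({
            "sender": sender["id"],
            "receiver": receiver["id"],
            "distance_km": haversine_km(sender["lat"], sender["lon"],
                                        receiver["lat"], receiver["lon"]),
            "same_owner": int(sender["owner"] == receiver["owner"]),
            "sender_type": sender["type"],
            "receiver_type": receiver["type"],
        })
    return rows


FEATURES = pair_features(SITES)
for row in FEATURES:
    print(row)
```

Each row describes a directed candidate movement, so the pair A to B is distinct from B to A. This matters because the production types of sender and receiver are not interchangeable. In a workflow like the one described, rows from counties with observed movements would carry a movement/no-movement label for training, and rows from the remaining counties would be scored by the fitted model.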
