An approach for de-identification of point locations of livestock premises for further use in disease spread modeling.

We describe a method for de-identifying point location data used for disease spread modeling, so that data custodians can share data with modeling experts without disclosing individual farm identities. The approach is implemented in an open-source software program that is described and evaluated here. The program allows a data custodian to select a level of de-identification based on the K-anonymity statistic. It converts a file of true farm locations and attributes into a file suitable for disease spread modeling, in which the locations are randomly modified to prevent re-identification based on location. Important epidemiological relationships such as clustering are preserved as much as possible, so that modeling results are similar to those obtained with the true, identifiable data. The software implementation was verified by visual inspection and basic descriptive spatial analysis of the output. Performance is sufficient to allow de-identification of even large data sets on desktop computers available to any data custodian.
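The abstract does not specify the displacement rule, so the following Python sketch illustrates only the general idea of tying random displacement to a chosen K. It assumes projected planar coordinates and moves each farm to a uniformly random point within a disc whose radius is the distance to that farm's K-th nearest neighbor. The function names `kth_neighbor_distance` and `mask_locations` are hypothetical and are not taken from the described software. This rule is one plausible geomasking heuristic, not the program's documented algorithm, and on its own it does not prove that K-anonymity is achieved.

```python
import math
import random


def kth_neighbor_distance(points, i, k):
    """Return the distance from points[i] to its k-th nearest other point."""
    xi, yi = points[i]
    dists = sorted(
        math.hypot(xi - x, yi - y)
        for j, (x, y) in enumerate(points)
        if j != i
    )
    return dists[k - 1]


def mask_locations(points, k, seed=None):
    """Randomly displace each point within its k-th-neighbor radius.

    Assumption (not from the source): the displacement is uniform over a
    disc, so dense clusters move only slightly and isolated farms move more.
    """
    if k < 1 or k > len(points) - 1:
        raise ValueError("k must be between 1 and len(points) - 1")
    rng = random.Random(seed)
    masked = []
    for i, (x, y) in enumerate(points):
        radius = kth_neighbor_distance(points, i, k)
        # The square root makes the draw uniform over the disc's area.
        r = radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        masked.append((x + r * math.cos(theta), y + r * math.sin(theta)))
    return masked
```

Because the displacement radius shrinks where farms are dense, a rule of this kind tends to keep tight clusters tight, which is the property the abstract identifies as important for modeling. A custodian would call, for example, `mask_locations(farm_xy, k=5)` on a list of `(x, y)` tuples and share only the returned coordinates.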
