Co-utile Collaborative Anonymization of Microdata

In surveys collecting individual data (microdata), each respondent is usually required to report values for a set of attributes. If some of these attributes contain sensitive information, the respondent must trust the collector not to make any inappropriate use of the data and, if any data are to be publicly released, to anonymize them properly to avoid disclosing sensitive information. If the respondent does not trust the data collector, she may report inaccurately or report nothing at all. To reduce the need for trust, local anonymization is an alternative whereby each respondent anonymizes her data prior to sending them to the data collector. However, local anonymization by each respondent without seeing other respondents’ data makes it hard to find a good trade-off between information loss and disclosure risk. We propose a distributed anonymization approach where users collaborate to attain an appropriate level of disclosure protection (and, thus, of information loss). Under our scheme, the final anonymized data are only as accurate as the information released by each respondent; hence, no trust needs to be assumed towards the data collector or any other respondent. Further, if respondents are interested in forming an accurate data set, the proposed collaborative anonymization protocols are self-enforcing and co-utile.
