Data Privacy against Composition Attack

Data anonymization has become a major technique in privacy-preserving data publishing. Many methods have been proposed to anonymize a single dataset or a series of datasets released by one data holder. However, none addresses multiple independent data publishing, in which a data holder publishes a dataset whose population overlaps with datasets published by other, independent data holders; existing methods cannot protect privacy in this scenario. In this paper, we propose a new generalization principle, (ρ,α)-anonymization, that effectively addresses the privacy concerns of multiple independent data publishing. We also develop an effective algorithm to achieve (ρ,α)-anonymization. We show experimentally that the proposed algorithm anonymizes data to satisfy the privacy requirement while preserving high data utility.
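
To make the threat concrete, the following is a minimal sketch of a composition (intersection) attack on two independently published generalized tables. It is not the (ρ,α)-anonymization algorithm, which the abstract does not specify. The toy releases, the names `hospital_a`, `hospital_b`, `candidates`, and `composition_attack`, and the assumption that the adversary knows the victim's age and ZIP code and knows the victim appears in both releases are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy releases. Each maps a generalized quasi-identifier group,
# ((min_age, max_age), zip_pattern), to the sensitive values published for it.
Release = Dict[Tuple[Tuple[int, int], str], Set[str]]

hospital_a: Release = {
    ((30, 39), "130**"): {"flu", "HIV", "asthma"},
    ((40, 49), "130**"): {"cancer", "flu", "diabetes"},
}
hospital_b: Release = {
    ((35, 44), "1305*"): {"HIV", "bronchitis", "gastritis"},
}


def candidates(release: Release, age: int, zip_code: str) -> Set[str]:
    """Sensitive values of every group whose generalized values cover the victim."""
    result: Set[str] = set()
    for ((low, high), zip_pattern), values in release.items():
        prefix = zip_pattern.rstrip("*")  # "130**" covers any ZIP starting with "130"
        if low <= age <= high and zip_code.startswith(prefix):
            result |= values
    return result


def composition_attack(releases: List[Release], age: int, zip_code: str) -> Set[str]:
    """Intersect the candidate sets from all releases.

    Assumes the adversary knows the victim is present in every release.
    """
    sets = [candidates(r, age, zip_code) for r in releases]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    # Each release alone leaves three candidate values for the victim.
    print(candidates(hospital_a, 36, "13053"))
    print(candidates(hospital_b, 36, "13053"))
    # Combining the two releases narrows the candidates to a single value.
    print(composition_attack([hospital_a, hospital_b], 36, "13053"))
```

In this toy setting each table on its own hides the victim among three sensitive values, yet the intersection across the two releases leaves only one. This is the kind of leakage that arises when independent data holders publish overlapping populations without coordination.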
