Current developments of k-anonymous data releasing

Disclosure-control is a traditional statistical methodology for protecting privacy when data is released for analysis. Disclosure-control methods have enjoyed a revival in the data mining community, especially after the introduction of the k-anonymity model by Samarati and Sweeney. Algorithmic advances on k-anonymisation provide simple and effective approaches to protect private information of individuals via only releasing k-anonymous views of a data set. Thus, the k-anonymity model has gained increasing popularity. Recent research identifies some drawbacks of the k-anonymity model and presents enhanced kanonymity models. This paper reviews problems of the k-anonymity model and its enhanced variants, and different methods for implementing k-anonymity. It compares the k-anonymity model with the secure multiparty computation-based privacy-preserving techniques in the data mining literature. The paper also discusses further development directions of the k-anonymous data releasing.

[1]  Ashwin Machanavajjhala,et al.  l-Diversity: Privacy Beyond k-Anonymity , 2006, ICDE.

[2]  Pierangela Samarati,et al.  Protecting Respondents' Identities in Microdata Release , 2001, IEEE Trans. Knowl. Data Eng..

[3]  Nabil R. Adam,et al.  Security-control methods for statistical databases: a comparative study , 1989, ACM Comput. Surv..

[4]  David J. DeWitt,et al.  Incognito: efficient full-domain K-anonymity , 2005, SIGMOD '05.

[5]  Ran Wolff,et al.  k-TTP: a new privacy model for large-scale distributed environments , 2004, KDD.

[6]  Latanya Sweeney,et al.  Achieving k-Anonymity Privacy Protection Using Generalization and Suppression , 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst..

[7]  Yehuda Lindell,et al.  Privacy Preserving Data Mining , 2000, Journal of Cryptology.

[8]  Chris Clifton,et al.  When do data mining results violate privacy? , 2004, KDD.

[9]  Raymond Chi-Wing Wong,et al.  (α, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing , 2006, KDD '06.

[10]  Pierangela Samarati,et al.  Generalizing Data to Provide Anonymity when Disclosing Information , 1998, PODS 1998.

[11]  Luís Torgo,et al.  Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases , 2005 .

[12]  Adam Meyerson,et al.  On the complexity of optimal K-anonymity , 2004, PODS.

[13]  Silvio Micali,et al.  How to play ANY mental game , 1987, STOC.

[14]  Roberto J. Bayardo,et al.  Data privacy through optimal k-anonymization , 2005, 21st International Conference on Data Engineering (ICDE'05).

[15]  Chris Clifton,et al.  Privacy-preserving k-means clustering over vertically partitioned data , 2003, KDD '03.

[16]  Philip S. Yu,et al.  Top-down specialization for information and privacy preservation , 2005, 21st International Conference on Data Engineering (ICDE'05).

[17]  Chris Clifton,et al.  Privacy-preserving data mining: why, how, and when , 2004, IEEE Security & Privacy Magazine.

[18]  Sushil Jajodia,et al.  The inference problem: a survey , 2002, SKDD.

[19]  Philip S. Yu,et al.  Bottom-up generalization: a data mining solution to privacy protection , 2004, Fourth IEEE International Conference on Data Mining (ICDM'04).

[20]  Ton de Waal,et al.  Statistical Disclosure Control in Practice , 1996 .

[21]  Rakesh Agrawal,et al.  Privacy-preserving data mining , 2000, SIGMOD 2000.

[22]  Philip S. Yu,et al.  Template-based privacy preservation in classification problems , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[23]  Vijay S. Iyengar,et al.  Transforming data to satisfy privacy constraints , 2002, KDD.

[24]  Jayant R. Haritsa,et al.  Maintaining Data Privacy in Association Rule Mining , 2002, VLDB.

[25]  Josep Domingo-Ferrer,et al.  Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation , 2005, Data Mining and Knowledge Discovery.

[26]  Dino Pedreschi,et al.  k-Anonymous Patterns , 2005, PKDD.

[27]  Andrew Chi-Chih Yao,et al.  Protocols for secure computations , 1982, FOCS 1982.

[28]  L. Cox Suppression Methodology and Statistical Disclosure Control , 1980 .

[29]  David J. DeWitt,et al.  Mondrian Multidimensional K-Anonymity , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[30]  Rebecca N. Wright,et al.  Privacy-preserving Bayesian network structure computation on distributed heterogeneous data , 2004, KDD.

[31]  Charu C. Aggarwal,et al.  On the design and quantification of privacy preserving data mining algorithms , 2001, PODS.

[32]  Josep Domingo-Ferrer,et al.  Practical Data-Oriented Microaggregation for Statistical Disclosure Control , 2002, IEEE Trans. Knowl. Data Eng..

[33]  A. Yao,et al.  Fair exchange with a semi-trusted third party (extended abstract) , 1997, CCS '97.

[34]  Rajeev Motwani,et al.  Anonymizing Tables , 2005, ICDT.

[35]  Latanya Sweeney,et al.  k-Anonymity: A Model for Protecting Privacy , 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst..

[36]  Charu C. Aggarwal,et al.  On k-Anonymity and the Curse of Dimensionality , 2005, VLDB.

[37]  Raymond Chi-Wing Wong,et al.  Achieving k-Anonymity by Clustering in Attribute Hierarchical Structures , 2006, DaWaK.