Privacy Preserving k-Anonymity for Re-publication of Incremental Datasets

Most previous work on k-anonymization has focused on a one-time release of data. In practice, however, data is often released continuously to serve various information needs. This study aims to develop an effective solution for the re-publication of incremental datasets. First, we analyze several possible generalizations in the anonymization of incremental updates and propose a monotonic generalization principle that effectively prevents privacy breaches in re-publication. Based on this principle, we then propose a partitioning-based algorithm for re-publication that securely and efficiently anonymizes a continuously growing dataset while maintaining high data quality. Extensive experiments with real data confirm the effectiveness of our approach.
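As a point of reference for the k-anonymity property that each release must satisfy, the following is a minimal sketch of a per-release check. It is not the partitioning algorithm or the monotonic generalization principle proposed here; the function name `is_k_anonymous`, the column names, and the toy records are all assumptions made for illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times.

    `rows` is a list of dicts holding already-generalized values;
    `quasi_identifiers` names the columns an attacker could link on.
    (Hypothetical helper, not taken from the paper.)
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


# Toy first release: two equivalence classes of size 2 (illustrative data only).
release_1 = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cold"},
    {"age": "30-39", "zip": "148**", "disease": "asthma"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
]

# An incremental update that adds a single record in a new equivalence class.
new_record = {"age": "40-49", "zip": "130**", "disease": "cold"}
release_2 = release_1 + [new_record]

print(is_k_anonymous(release_1, ["age", "zip"], 2))
print(is_k_anonymous(release_2, ["age", "zip"], 2))
```

The first release passes the check for k = 2, while naively appending the new record does not, since that record forms a class of size 1. Passing this check on each release separately is only a baseline: the concern raised above is that breaches can arise across successive releases, which a per-release test like this one does not examine.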
