How Anonymous Is k-Anonymous? Look at Your Quasi-ID

The concept of quasi-ID (QI) is fundamental to k-anonymity, which has recently gained popularity as a privacy-preserving method for microdata publication. This paper shows that it is important to give QI a formal underpinning, which, surprisingly, has been largely absent from the literature. The study provides a first look at the correct and incorrect uses of QI in k-anonymization processes and exposes the conservative assumptions that are implicit even when QI is used correctly. The original notions introduced in this paper are (1) k-anonymity under the assumption of a formally defined external information source, independent of the QI notion, and (2) k-QI, an extension of the traditional QI that is shown to be a necessary refinement. A definition of k-anonymity that does not rely on QI is interesting in its own right, but more importantly, it provides a sound framework for gauging the use of QI in k-anonymization.
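
For readers unfamiliar with the traditional notion that the paper examines, the following is a minimal sketch of the usual table-based check: a released table is treated as k-anonymous with respect to a chosen QI if every combination of QI values it contains is shared by at least k rows. This is not the paper's formal definition, which is stated relative to an external information source. The function name `is_k_anonymous`, the column names, and the sample rows are all hypothetical and serve only as illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_id, k):
    """Return True if every QI value combination occurs in at least k rows.

    rows      -- list of dicts, one per released record (hypothetical layout)
    quasi_id  -- tuple of column names assumed to form the quasi-ID
    k         -- required minimum group size
    """
    # Group records by their projection onto the quasi-ID columns.
    group_sizes = Counter(tuple(row[col] for col in quasi_id) for row in rows)
    # An empty table has no groups, so the check holds vacuously.
    return all(size >= k for size in group_sizes.values())


# Hypothetical generalized release: ZIP codes and ages are already coarsened.
released = [
    {"zip": "130**", "age": "20-29", "disease": "flu"},
    {"zip": "130**", "age": "20-29", "disease": "cold"},
    {"zip": "148**", "age": "30-39", "disease": "flu"},
    {"zip": "148**", "age": "30-39", "disease": "asthma"},
]

# The verdict depends entirely on which attributes are assumed to be the QI.
print(is_k_anonymous(released, ("zip", "age"), 2))
print(is_k_anonymous(released, ("zip", "age", "disease"), 2))
```

In this sketch the same table passes or fails the check depending on which columns are declared to be the QI, which is why the choice of QI, and the assumptions behind it, matters for any anonymity claim.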
