δ-Presence without Complete World Knowledge

Advances in information technology, and its use in research, are increasing both the need for anonymized data and the risks of poor anonymization. In earlier work we presented a privacy metric, δ-presence, that clearly links the quality of anonymization to the risk posed by inadequate anonymization. That work showed that existing anonymization techniques are inappropriate for situations where δ-presence is a good metric (specifically, where knowing that an individual is in the database poses a privacy risk). This article addresses a practical problem with that approach, extending it to situations where the data anonymizer is not assumed to have complete world knowledge. The algorithms are evaluated in the context of a real-world scenario, demonstrating the practical applicability of the approach.
