Differential Identifiability

A key challenge in privacy-preserving data mining is ensuring that a data mining result does not inherently violate privacy. ε-Differential Privacy appears to provide a solution to this problem. However, there are no clear guidelines on how to set ε to satisfy a given privacy policy. We give an alternative formulation, Differential Identifiability, parameterized by the probability of individual identification. This provides the strong privacy guarantees of differential privacy, while letting policy makers set parameters based on the established privacy concept of individual identifiability.
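To make the idea concrete, the sketch below shows how a policy-level identification-probability target ρ might be translated into an ε for a standard Laplace mechanism. The mapping ε = ln(ρ(m − 1)/(1 − ρ)), where m is the number of candidate individuals, is an illustrative assumption for this sketch, not a formula quoted from the abstract; the function names are likewise hypothetical.

```python
import math
import random


def eps_from_rho(rho, m):
    """Map an identification-probability target rho (0 < rho < 1) over m
    candidate individuals to a differential-privacy epsilon.

    NOTE: this mapping is an illustrative assumption for exposition,
    not the paper's exact derivation.
    """
    assert 0 < rho < 1 and m >= 2
    return math.log(rho * (m - 1) / (1 - rho))


def laplace_mechanism(true_value, sensitivity, eps, rng=None):
    """Release true_value with Laplace(sensitivity/eps) noise, the
    standard mechanism for eps-differential privacy (inverse-CDF sampling)."""
    rng = rng or random.Random(0)
    u = rng.random() - 0.5          # uniform on (-0.5, 0.5)
    scale = sensitivity / eps
    return true_value - scale * math.copysign(math.log(1 - 2 * abs(u)), u)


# Policy maker picks rho = 0.9 over a pool of 11 candidate individuals;
# the derived eps then calibrates the noise for a count query.
eps = eps_from_rho(0.9, m=11)
noisy_count = laplace_mechanism(1000, sensitivity=1.0, eps=eps)
```

The point of the sketch is the direction of the interface: the policy maker reasons in terms of an identification probability, and ε is derived from it rather than chosen directly.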
