On the Difficulties of Disclosure Prevention in Statistical Databases or The Case for Differential Privacy

In 1977 Tore Dalenius articulated a desideratum for statistical databases: nothing about an individual should be learnable from the database that cannot be learned without access to the database. We give a general impossibility result showing that a natural formalization of Dalenius’ goal cannot be achieved if the database is useful. The key obstacle is the side information that may be available to an adversary. Our results hold under very general conditions regarding the database, the notion of privacy violation, and the notion of utility. Contrary to intuition, a variant of the result threatens the privacy even of someone not in the database. This state of affairs motivated the notion of differential privacy [15, 16], a strong ad omnia privacy guarantee which, intuitively, captures the increased risk to one’s privacy incurred by participating in a database.

[1]  Ronen Shaltiel,et al.  Recent Developments in Explicit Constructions of Extractors , 2002, Bull. EATCS.

[2]  Rafail Ostrovsky,et al.  Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data , 2004, SIAM J. Comput.

[3]  Alexandre V. Evfimievski,et al.  Limiting privacy breaches in privacy preserving data mining , 2003, PODS.

[4]  Silvio Micali,et al.  A Digital Signature Scheme Secure Against Adaptive Chosen-Message Attacks , 1988, SIAM J. Comput.

[5]  Cynthia Dwork,et al.  Privacy, accuracy, and consistency too: a holistic solution to contingency table release , 2007, PODS.

[6]  Irit Dinur,et al.  Revealing information while preserving privacy , 2003, PODS.

[7]  Oded Goldreich,et al.  Modern Cryptography, Probabilistic Proofs and Pseudorandomness , 1998, Algorithms and Combinatorics.

[8]  Kunal Talwar,et al.  Mechanism Design via Differential Privacy , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[9]  Sofya Raskhodnikova,et al.  What Can We Learn Privately? , 2008, FOCS.

[10]  Oded Goldreich,et al.  Foundations of Cryptography: Basic Tools , 2000.

[11]  Cynthia Dwork,et al.  Privacy-Preserving Datamining on Vertically Partitioned Databases , 2004, CRYPTO.

[12]  Latanya Sweeney,et al.  Weaving Technology and Policy Together to Maintain Confidentiality , 1997, Journal of Law, Medicine & Ethics.

[13]  Latanya Sweeney,et al.  Achieving k-Anonymity Privacy Protection Using Generalization and Suppression , 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst.

[14]  Dorothy E. Denning,et al.  Secure statistical databases with random sample queries , 1980, TODS.

[15]  Jin H. Im,et al.  Privacy , 2002, Encyclopedia of Information Systems.

[16]  Silvio Micali,et al.  Probabilistic Encryption , 1984, J. Comput. Syst. Sci..

[17]  Ramakrishnan Srikant,et al.  Privacy-preserving data mining , 2000, SIGMOD '00.

[18]  Cynthia Dwork,et al.  On Privacy-Preserving Histograms , 2005, UAI.

[19]  Nabil R. Adam,et al.  Security-control methods for statistical databases: a comparative study , 1989, ACM Comput. Surv..

[20]  Cynthia Dwork,et al.  On the Utility of Privacy-Preserving Histograms , 2004.

[21]  Sofya Raskhodnikova,et al.  Smooth sensitivity and sampling in private data analysis , 2007, STOC '07.

[22]  Leonid A. Levin,et al.  Pseudo-random generation from one-way functions , 1989, STOC '89.

[23]  Leonid A. Levin,et al.  Pseudo-random Generation from one-way functions (Extended Abstracts) , 1989, STOC 1989.

[24]  Hoeteck Wee,et al.  Toward Privacy in Public Databases , 2005, TCC.

[25]  Cynthia Dwork,et al.  Practical privacy: the SuLQ framework , 2005, PODS.

[26]  Richard J. Lipton,et al.  Secure databases: protection against user influence , 1979, TODS.

[27]  Avi Wigderson,et al.  Tiny Families of Functions with Random Properties: A Quality-Size Trade-off for Hashing , 1997, Electron. Colloquium Comput. Complex.

[28]  Yevgeniy Dodis,et al.  Correcting errors without leaking partial information , 2005, STOC '05.

[29]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[30]  Oded Goldreich,et al.  Definitions and properties of zero-knowledge proof systems , 1994, Journal of Cryptology.

[31]  Aaron Roth,et al.  A learning theory approach to noninteractive database privacy , 2011, JACM.

[32]  Noam Nisan,et al.  Randomness is Linear in Space , 1996, J. Comput. Syst. Sci..

[33]  Oded Goldreich,et al.  Definitions and properties of zero-knowledge proof systems , 1994.

[34]  Aravind Srinivasan,et al.  Computing with very weak random sources , 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.

[35]  Amit Sahai,et al.  On the (im)possibility of obfuscating programs , 2001, JACM.