Differential Privacy: A Survey of Results

Over the past five years a new approach to privacy-preserving data analysis has borne fruit [13, 18, 7, 19, 5, 37, 35, 8, 32]. This approach differs from much (but not all!) of the related literature in the statistics, databases, theory, and cryptography communities, in that a formal and ad omnia privacy guarantee is defined, and the data analysis techniques presented are rigorously proved to satisfy the guarantee. The key privacy guarantee that has emerged is differential privacy. Roughly speaking, this ensures that (almost, and quantifiably) no risk is incurred by joining a statistical database. In this survey, we recall the definition of differential privacy and two basic techniques for achieving it. We then show some interesting applications of these techniques, presenting algorithms for three specific tasks and three general results on differentially private learning.

[1]  Ivan P. Fellegi, et al.  On the Question of Statistical Confidentiality, 1972.

[2]  Peter J. Denning, et al.  The tracker: a threat to statistical database security, 1979, TODS.

[3]  James O. Achugbue, et al.  The Effectiveness Of Output Modification By Rounding For Protection Of Statistical Data Bases, 1979.

[4]  Richard J. Lipton, et al.  Secure databases: protection against user influence, 1979, TODS.

[5]  Steven P. Reiss.  Practical Data-Swapping: The First Steps, 1980, 1980 IEEE Symposium on Security and Privacy.

[6]  Leland L. Beck, et al.  A security mechanism for statistical databases, 1980, TODS.

[7]  Dorothy E. Denning, et al.  Secure statistical databases with random sample queries, 1980, TODS.

[8]  Gultekin Özsoyoglu, et al.  Auditing and Inference Control in Statistical Databases, 1982, IEEE Transactions on Software Engineering.

[9]  Arie Shoshani, et al.  Statistical Databases: Characteristics, Problems, and some Solutions, 1982, VLDB.

[10]  Ezio Lefons, et al.  An Analytic Approach to Statistical Databases, 1983, VLDB.

[11]  Leslie G. Valiant, et al.  A theory of the learnable, 1984, STOC '84.

[12]  Silvio Micali, et al.  Probabilistic Encryption, 1984, J. Comput. Syst. Sci.

[13]  Dan Gusfield, et al.  A Graph Theoretic Approach to Statistical Data Security, 1988, SIAM J. Comput.

[14]  Nabil R. Adam, et al.  Security-control methods for statistical databases: a comparative study, 1989, ACM Comput. Surv.

[15]  L. Sweeney, et al.  Weaving Technology and Policy Together to Maintain Confidentiality, 1997, Journal of Law, Medicine & Ethics.

[16]  Pierangela Samarati, et al.  Generalizing Data to Provide Anonymity when Disclosing Information, 1998, PODS 1998.

[17]  Pierangela Samarati, et al.  Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression, 1998.

[18]  Stephen E. Fienberg, et al.  Confidentiality and Data Protection Through Disclosure Limitation: Evolving Principles and Technical Advances, 2000.

[19]  N. Smelser, et al.  International Encyclopedia of the Social and Behavioral Sciences, 2001.

[20]  Charu C. Aggarwal, et al.  On the design and quantification of privacy preserving data mining algorithms, 2001, PODS.

[21]  Yehuda Lindell, et al.  Privacy Preserving Data Mining, 2002, Journal of Cryptology.

[22]  George T. Duncan, et al.  Confidentiality and Statistical Disclosure Limitations, 2001.

[23]  Latanya Sweeney, et al.  k-Anonymity: A Model for Protecting Privacy, 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst.

[24]  Latanya Sweeney, et al.  Achieving k-Anonymity Privacy Protection Using Generalization and Suppression, 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst.

[25]  Alexandre V. Evfimievski, et al.  Limiting privacy breaches in privacy preserving data mining, 2003, PODS.

[26]  L. Franconi, et al.  Implementing Statistical Disclosure Control For Aggregated Data Released Via Remote Access, 2003.

[27]  Irit Dinur, et al.  Revealing information while preserving privacy, 2003, PODS.

[28]  Jerome P. Reiter, et al.  Multiple Imputation for Statistical Disclosure Limitation, 2003.

[29]  Cynthia Dwork, et al.  Privacy-Preserving Datamining on Vertically Partitioned Databases, 2004, CRYPTO.

[30]  Matthew Franklin, et al.  Advances in Cryptology – CRYPTO 2004, 2004, Lecture Notes in Computer Science.

[31]  Hoeteck Wee, et al.  Toward Privacy in Public Databases, 2005, TCC.

[32]  Cynthia Dwork, et al.  Practical privacy: the SuLQ framework, 2005, PODS.

[33]  Moni Naor, et al.  Our Data, Ourselves: Privacy Via Distributed Noise Generation, 2006, EUROCRYPT.

[34]  Cynthia Dwork, et al.  Differential Privacy, 2006, ICALP.

[35]  Serge Vaudenay, et al.  Advances in Cryptology – EUROCRYPT 2006, 2006, Lecture Notes in Computer Science.

[36]  Ashwin Machanavajjhala, et al.  L-diversity: privacy beyond k-anonymity, 2006, 22nd International Conference on Data Engineering (ICDE'06).

[37]  Vitaly Shmatikov, et al.  How To Break Anonymity of the Netflix Prize Dataset, 2006, ArXiv.

[38]  Cynthia Dwork, et al.  Calibrating Noise to Sensitivity in Private Data Analysis, 2006, TCC.

[39]  Sofya Raskhodnikova, et al.  Smooth sensitivity and sampling in private data analysis, 2007, STOC '07.

[40]  Cynthia Dwork, et al.  The price of privacy and the limits of LP decoding, 2007, STOC '07.

[41]  Daniel A. Spielman, et al.  Spectral Graph Theory and its Applications, 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[42]  Cynthia Dwork, et al.  Privacy, accuracy, and consistency too: a holistic solution to contingency table release, 2007, PODS.

[43]  Yufei Tao, et al.  M-invariance: towards privacy preserving re-publication of dynamic datasets, 2007, SIGMOD '07.

[44]  Kunal Talwar, et al.  Mechanism Design via Differential Privacy, 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[45]  A. Blum, et al.  A learning theory approach to non-interactive database privacy, 2008, STOC.

[46]  Sofya Raskhodnikova, et al.  What Can We Learn Privately?, 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.

[47]  Cynthia Dwork, et al.  New Efficient Attacks on Statistical Disclosure Control Mechanisms, 2008, CRYPTO.