The Limits of Two-Party Differential Privacy

We study differential privacy in a distributed setting where two parties wish to analyze their joint data while preserving privacy for both datasets. Our results imply almost tight lower bounds on the accuracy of such data analyses, both for specific natural functions (such as Hamming distance) and in general. These bounds expose a sharp contrast between the two-party setting and the simpler client-server setting (where privacy guarantees are one-sided). They also demonstrate a dramatic gap between the accuracy obtainable by differentially private data analysis and the accuracy obtainable when privacy is relaxed to a computational variant of differential privacy. The first proof technique we develop demonstrates a connection between differential privacy and deterministic extraction from Santha-Vazirani sources. A second connection we expose indicates that the ability to approximate a function by a low-error differentially private protocol is strongly related to the ability to approximate it by a low-communication protocol; the connection goes in both directions.
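
To make the two-party setting concrete, the following is a minimal sketch of one simple baseline for estimating Hamming distance under two-sided privacy. It is an illustration under stated assumptions, not a construction taken from this paper: the first party perturbs its bits with randomized response, and the second party adds Laplace noise to the count before releasing it. The function names, the privacy parameter `eps`, and the choice to give both parties the same `eps` are all hypothetical.

```python
import math
import random


def randomized_response(bits, eps, rng):
    """Flip each bit independently with probability 1 / (1 + e^eps).

    The ratio between keeping and flipping a bit is e^eps, which is the
    standard eps-differentially-private randomized response.
    """
    p_flip = 1.0 / (1.0 + math.exp(eps))
    return [b ^ int(rng.random() < p_flip) for b in bits]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def estimate_hamming(x, y, eps, rng):
    """Two-message sketch: party A holds x, party B holds y (assumed roles).

    1. A sends a randomized-response copy of x to B (protects x).
    2. B counts disagreements with y and adds Laplace(1/eps) noise before
       releasing the count (protects y; one bit of y changes the count by 1).
    3. The released value is debiased into an estimate of the true distance.
    """
    n = len(x)
    p_flip = 1.0 / (1.0 + math.exp(eps))
    noisy_x = randomized_response(x, eps, rng)               # message A -> B
    noisy_count = sum(a != b for a, b in zip(noisy_x, y))
    released = noisy_count + laplace_noise(1.0 / eps, rng)   # message B -> A
    # E[noisy_count] = d * (1 - 2 * p_flip) + n * p_flip, so invert that map.
    return (released - n * p_flip) / (1.0 - 2.0 * p_flip)
```

For constant `eps`, the standard deviation of this estimate grows on the order of the square root of the input length, since the randomized-response step contributes binomial noise across all positions. That scaling is the kind of accuracy loss the lower bounds described above are concerned with.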
