The Limits of Two-party Differential Privacy

—We study differential privacy in a distributed setting where two parties would like to perform analysis of their joint data while preserving privacy for both datasets. Our results imply almost tight lower bounds on the accuracy of such data analyses, both for specific natural functions (such as Hamming distance) and in general. Our bounds expose a sharp contrast between the two-party setting and the simpler client-server setting (where privacy guarantees are one-sided). In addition, those bounds demonstrate a dramatic gap between the accuracy that can be obtained by differentially private data analysis versus the accuracy obtainable when privacy is relaxed to a computational variant of differential privacy. The first proof technique we develop demonstrates a connection between differential privacy and deterministic extraction from Santha-Vazirani sources. A second connection we expose indicates that the ability to approximate a function by a low-error differentially-private protocol is strongly related to the ability to approximate it by a low communication protocol. (The connection goes in both directions.)

[1]  Joan Feigenbaum,et al.  Approximate privacy: foundations and quantification (extended abstract) , 2010, EC '10.

[2]  Xi Chen,et al.  How to compress interactive communication , 2010, STOC '10.

[3]  Hartmut Klauck,et al.  The Partition Bound for Classical Communication Complexity and Query Complexity , 2009, 2010 IEEE 25th Annual Conference on Computational Complexity.

[4]  Alessandro Panconesi,et al.  Concentration of Measure for the Analysis of Randomized Algorithms , 2009 .

[5]  Kunal Talwar,et al.  Mechanism Design via Differential Privacy , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[6]  Thomas Holenstein Parallel Repetition: Simplification and the No-Signaling Case , 2006, Theory Comput..

[7]  Moni Naor,et al.  Our Data, Ourselves: Privacy Via Distributed Noise Generation , 2006, EUROCRYPT.

[8]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[9]  Cynthia Dwork,et al.  Privacy-Preserving Datamining on Vertically Partitioned Databases , 2004, CRYPTO.

[10]  Yevgeniy Dodis,et al.  On Extracting Private Randomness over a Public Channel , 2003, RANDOM-APPROX.

[11]  Ziv Bar-Yossef,et al.  An information statistics approach to data stream and communication complexity , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..

[12]  Andrew Chi-Chih Yao,et al.  Informational complexity and the direct sum problem for simultaneous message complexity , 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[13]  Rafail Ostrovsky,et al.  Replication is not needed: single database, computationally-private information retrieval , 1997, Proceedings 38th Annual Symposium on Foundations of Computer Science.

[14]  Udi Manber,et al.  Finding Similar Files in a Large File System , 1994, USENIX Winter.

[15]  Eyal Kushilevitz,et al.  Privacy and communication complexity , 1989, 30th Annual Symposium on Foundations of Computer Science.

[16]  Eyal Kushilevitz,et al.  A zero-one law for Boolean privacy , 1989, STOC '89.

[17]  Umesh V. Vazirani,et al.  Strong communication complexity or generating quasi-random sequences from two communicating semi-random sources , 1987, Comb..

[18]  Oded Goldreich,et al.  A randomized protocol for signing contracts , 1985, CACM.

[19]  Andrew Chi-Chih Yao,et al.  Some complexity questions related to distributive computing(Preliminary Report) , 1979, STOC.

[20]  S L Warner,et al.  Randomized response: a survey technique for eliminating evasive answer bias. , 1965, Journal of the American Statistical Association.

[21]  Avi Wigderson,et al.  A note on ex-tracting randomness from Santha-Vazirani sources , 2004 .

[22]  Andrew Chi-Chih Yao,et al.  Protocols for Secure Computations (Extended Abstract) , 1982, FOCS.