Distributed Differential Privacy via Shuffling

We consider the problem of designing scalable, robust protocols for computing statistics about sensitive data. Specifically, we look at how best to design differentially private protocols in a distributed setting, where each user holds a private datum. The literature has mostly considered two models: the "central" model, in which a trusted server collects users' data in the clear, which allows greater accuracy; and the "local" model, in which users individually randomize their data, and need not trust the server, but accuracy is limited. Attempts to achieve the accuracy of the central model without a trusted server have so far focused on variants of cryptographic MPC, which limits scalability. In this paper, we initiate the analytic study of a shuffled model for distributed differentially private algorithms, which lies between the local and central models. This simple-to-implement model, a special case of the ESA framework of [Bittau et al., '17], augments the local model with an anonymous channel that randomly permutes a set of user-supplied messages. For sum queries, we show that this model provides the power of the central model while avoiding the need to trust a central server and the complexity of cryptographic secure function evaluation. More generally, we give evidence that the power of the shuffled model lies strictly between those of the central and local models: for a natural restriction of the model, we show that shuffled protocols for a widely studied selection problem require exponentially higher sample complexity than do central-model protocols.

[1]  Nickolai Zeldovich,et al.  Stadium: A Distributed Metadata-Private Messaging System , 2017, IACR Cryptol. ePrint Arch..

[2]  Jonathan Ullman,et al.  Tight Lower Bounds for Locally Differentially Private Selection , 2018, ArXiv.

[3]  Michael Kearns,et al.  Efficient noise-tolerant learning from statistical queries , 1993, STOC.

[4]  Daniel A. Spielman,et al.  Spectral Graph Theory and its Applications , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[5]  Moni Naor,et al.  Our Data, Ourselves: Privacy Via Distributed Noise Generation , 2006, EUROCRYPT.

[6]  Thomas Steinke,et al.  Tight Lower Bounds for Differentially Private Selection , 2017, 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS).

[7]  Kunal Talwar,et al.  Mechanism Design via Differential Privacy , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[8]  Srinivas Devadas,et al.  Riffle: An Efficient Communication System With Strong Anonymity , 2016, Proc. Priv. Enhancing Technol..

[9]  Martin J. Wainwright,et al.  Local privacy and statistical minimax rates , 2013, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[10]  Salil P. Vadhan,et al.  The Complexity of Differential Privacy , 2017, Tutorials on the Foundations of Cryptography.

[11]  Shiva Prasad Kasiviswanathan,et al.  On the 'Semantics' of Differential Privacy: A Bayesian Formulation , 2008, J. Priv. Confidentiality.

[12]  Dan Boneh,et al.  Prio: Private, Robust, and Scalable Computation of Aggregate Statistics , 2017, NSDI.

[13]  S L Warner,et al.  Randomized response: a survey technique for eliminating evasive answer bias. , 1965, Journal of the American Statistical Association.

[14]  David Chaum,et al.  Untraceable electronic mail, return addresses, and digital pseudonyms , 1981, CACM.

[15]  Jonathan Ullman,et al.  The Price of Selection in Differential Privacy , 2017, COLT.

[16]  Úlfar Erlingsson,et al.  RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response , 2014, CCS.

[17]  Elaine Shi,et al.  Privacy-Preserving Aggregation of Time-Series Data , 2011, NDSS.

[18]  Alexandre V. Evfimievski,et al.  Limiting privacy breaches in privacy preserving data mining , 2003, PODS.

[19]  John M. Abowd,et al.  The U.S. Census Bureau Adopts Differential Privacy , 2018, KDD.

[20]  Sarvar Patel,et al.  Practical Secure Aggregation for Privacy-Preserving Machine Learning , 2017, IACR Cryptol. ePrint Arch..

[21]  Eran Omri,et al.  Distributed Private Data Analysis: On Simultaneously Solving How and What , 2008, CRYPTO.

[22]  Úlfar Erlingsson,et al.  Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity , 2018, SODA.

[23]  Uri Stemmer,et al.  Heavy Hitters and the Structure of Local Privacy , 2017, PODS.

[24]  Sofya Raskhodnikova,et al.  What Can We Learn Privately? , 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.

[25]  Elaine Shi,et al.  Optimal Lower Bound for Differentially Private Multi-party Aggregation , 2012, ESA.

[26]  Guy N. Rothblum,et al.  Boosting and Differential Privacy , 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[27]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[28]  Elaine Shi,et al.  Privacy-Preserving Stream Aggregation with Fault Tolerance , 2012, Financial Cryptography.

[29]  Raef Bassily,et al.  Local, Private, Efficient Protocols for Succinct Histograms , 2015, STOC.

[30]  Úlfar Erlingsson,et al.  Prochlo: Strong Privacy for Analytics in the Crowd , 2017, SOSP.

[31]  Nickolai Zeldovich,et al.  Vuvuzela: scalable private messaging resistant to traffic analysis , 2015, SOSP.