Towards Statistical Queries over Distributed Private User Data

To protect the privacy of individual users' personal data, a growing number of researchers propose storing that data on client computers or in personal data stores in the cloud, and letting users tightly control its release. While this allows specific applications to use approved user data, it precludes broad statistical analysis of user data. Distributed differential privacy is one approach to enabling such analysis, but previous proposals are impractical because they either scale poorly or require trusted clients. This paper proposes a design that overcomes these limitations: it places tight bounds on the extent to which malicious clients can distort answers, scales well, and tolerates churn among clients. The paper presents a detailed design and analysis, and reports performance results for a complete implementation, obtained from a deployment of over 600 clients.
