BLENDER: Enabling Local Search with a Hybrid Differential Privacy Model

We propose a hybrid model of differential privacy that considers a combination of regular and opt-in users who desire the differential privacy guarantees of the local privacy model and the trusted curator model, respectively. We demonstrate that within this model, it is possible to design a new type of blended algorithm that improves the utility of obtained data, while providing users with their desired privacy guarantees. We apply this algorithm to the task of privately computing the head of the search log and show that the blended approach provides significant improvements in the utility of the data compared to related work. Specifically, on two large search click data sets, comprising 1.75 and 16 GB, respectively, our approach attains NDCG values exceeding 95% across a range of privacy budget values.

[1]  S L Warner,et al.  Randomized response: a survey technique for eliminating evasive answer bias. , 1965, Journal of the American Statistical Association.

[2]  Jaana Kekäläinen,et al.  Cumulated gain-based evaluation of IR techniques , 2002, TOIS.

[3]  Alessandro Acquisti,et al.  Privacy and rationality in individual decision making , 2005, IEEE Security & Privacy.

[4]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[5]  Aristides Gionis,et al.  The impact of caching on search engines , 2007, SIGIR.

[6]  Kunal Talwar,et al.  Mechanism Design via Differential Privacy , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[7]  Aristides Gionis,et al.  Design trade-offs for search engine caching , 2008, TWEB.

[8]  Sofya Raskhodnikova,et al.  What Can We Learn Privately? , 2008, FOCS.

[9]  Vitaly Shmatikov,et al.  Robust De-anonymization of Large Sparse Datasets , 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).

[10]  Ashwin Machanavajjhala,et al.  Privacy: Theory meets Practice on the Map , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[11]  Nina Mishra,et al.  Releasing search queries and clicks privately , 2009, WWW '09.

[12]  Rong Jin,et al.  Learning to Rank by Optimizing NDCG Measure , 2009, NIPS.

[13]  Adam D. Smith,et al.  Discovering frequent patterns in sensitive data , 2010, KDD.

[14]  Fabrizio Silvestri,et al.  Mining Query Logs: Turning Search Usage Data into Knowledge , 2010, Found. Trends Inf. Retr..

[15]  Guy N. Rothblum,et al.  Boosting and Differential Privacy , 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[16]  C. Dwork A firm foundation for private data analysis , 2011, Commun. ACM.

[17]  Aleksandra Korolova Privacy Violations Using Microtargeted Ads: A Case Study , 2011, J. Priv. Confidentiality.

[18]  Ashwin Machanavajjhala,et al.  Publishing Search Logs—A Comparative Study of Privacy Guarantees , 2012, IEEE Transactions on Knowledge and Data Engineering.

[19]  Sanjeev Khanna,et al.  Distributed Private Heavy Hitters , 2012, ICALP.

[20]  Ninghui Li,et al.  PrivBasis: Frequent Itemset Mining with Differential Privacy , 2012, Proc. VLDB Endow..

[21]  Martin J. Wainwright,et al.  Local privacy and statistical minimax rates , 2013, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[22]  A. Anonymous,et al.  Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy , 2013, J. Priv. Confidentiality.

[23]  Miguel Á. Carreira-Perpiñán,et al.  Projection onto the probability simplex: An efficient algorithm with a simple proof, and an application , 2013, ArXiv.

[24]  Aaron Roth,et al.  The Algorithmic Foundations of Differential Privacy , 2014, Found. Trends Theor. Comput. Sci..

[25]  Andreas Haeberlen,et al.  Differential Privacy: An Economic Method for Choosing Epsilon , 2014, 2014 IEEE 27th Computer Security Foundations Symposium.

[26]  Úlfar Erlingsson,et al.  RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response , 2014, CCS.

[27]  Vaidy S. Sunderam,et al.  Monitoring web browsing behavior with differential privacy , 2014, WWW.

[28]  Pramod Viswanath,et al.  Extremal Mechanisms for Local Differential Privacy , 2014, J. Mach. Learn. Res..

[29]  Raef Bassily,et al.  Local, Private, Efficient Protocols for Succinct Histograms , 2015, STOC.

[30]  G. Loewenstein,et al.  Privacy and human behavior in the age of information , 2015, Science.

[31]  Sabine Trepte,et al.  Is the privacy paradox a relic of the past? An in‐depth analysis of privacy attitudes and privacy behaviors , 2015 .

[32]  Peter Kairouz,et al.  Discrete Distribution Estimation under Local Privacy , 2016, ICML.

[33]  Yin Yang,et al.  Heavy Hitter Estimation over Set-Valued Data with Local Differential Privacy , 2016, CCS.

[34]  Úlfar Erlingsson,et al.  Building a RAPPOR with the Unknown: Privacy-Preserving Learning of Associations and Data Dictionaries , 2015, Proc. Priv. Enhancing Technol..

[35]  Zhiwei Xu,et al.  Privacy-Preserving Query Log Sharing Based on Prior N-Word Aggregation , 2016, 2016 IEEE Trustcom/BigDataSE/ISPA.

[36]  Jun Tang,et al.  Privacy Loss in Apple's Implementation of Differential Privacy on MacOS 10.12 , 2017, ArXiv.

[37]  Ronitt Rubinfeld,et al.  Differentially Private Identity and Closeness Testing of Discrete Distributions , 2017, ArXiv.

[38]  Ninghui Li,et al.  Locally Differentially Private Protocols for Frequency Estimation , 2017, USENIX Security Symposium.

[39]  Martín Abadi,et al.  Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data , 2016, ICLR.

[40]  Raef Bassily,et al.  Practical Locally Private Heavy Hitters , 2017, NIPS.

[41]  Vitaly Feldman,et al.  Privacy Amplification by Iteration , 2018, 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS).

[42]  Huanyu Zhang,et al.  Differentially Private Testing of Identity and Closeness of Discrete Distributions , 2017, NeurIPS.

[43]  Jinyuan Jia,et al.  Calibrate: Frequency Estimation and Heavy Hitter Identification with Local Differential Privacy via Incorporating Prior Knowledge , 2018, IEEE INFOCOM 2019 - IEEE Conference on Computer Communications.