Engineering Methods for Differentially Private Histograms: Efficiency Beyond Utility

Publishing histograms with $\epsilon$-differential privacy has been studied extensively in the literature. Existing schemes aim to maximize the utility of the published data, and previous experimental evaluations analyze the privacy/utility trade-off. In this paper, we provide the first experimental evaluation of differentially private methods that goes beyond utility and also emphasizes another important aspect, namely efficiency. To this end, we first observe that all existing schemes are composed of a small set of common blocks. We then optimize and choose the best implementation for each block, determine the combinations of blocks that capture the entire literature, and propose novel block combinations. We qualitatively assess the schemes based on the skyline of efficiency and utility, i.e., based on whether or not a method is dominated on both aspects. Using exhaustive experiments on four real datasets with different characteristics, we conclude that there are always trade-offs between utility and efficiency. We demonstrate that the schemes derived from our novel block combinations provide the best trade-offs for time-critical applications. Our work can serve as a guide to help practitioners engineer a differentially private histogram scheme depending on their application requirements.
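
The skyline criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual Pareto notion of dominance (one scheme is no worse than another on both error and running time, and strictly better on at least one); the abstract does not spell out the exact definition, so this is an assumption. The scheme names and measurements below are hypothetical and are not results from the paper.

```python
from typing import List, NamedTuple


class Scheme(NamedTuple):
    name: str
    error: float    # utility loss; lower is better (hypothetical values)
    runtime: float  # seconds; lower is better (hypothetical values)


def dominates(a: Scheme, b: Scheme) -> bool:
    """Return True if a is no worse than b on both aspects and strictly better on one."""
    no_worse = a.error <= b.error and a.runtime <= b.runtime
    strictly_better = a.error < b.error or a.runtime < b.runtime
    return no_worse and strictly_better


def skyline(schemes: List[Scheme]) -> List[Scheme]:
    """Keep every scheme that no other scheme dominates."""
    return [s for s in schemes if not any(dominates(o, s) for o in schemes)]


# Hypothetical measurements, for illustration only.
candidates = [
    Scheme("A", error=0.10, runtime=5.0),
    Scheme("B", error=0.30, runtime=0.5),
    Scheme("C", error=0.35, runtime=2.0),   # worse than B on both aspects
    Scheme("D", error=0.05, runtime=60.0),
]

print([s.name for s in skyline(candidates)])  # prints ['A', 'B', 'D']
```

In this toy input, scheme C drops out because B is both more accurate and faster, while A, B, and D each represent a different trade-off between utility and efficiency.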
