Differentially Private Hierarchical Count-of-Counts Histograms

We consider the problem of privately releasing a class of queries that we call hierarchical count-of-counts histograms. Count-of-counts histograms partition the rows of an input table into groups (e.g., group of people in the same household), and for every integer j report the number of groups of size j. Hierarchical count-of-counts queries report count-of-counts histograms at different granularities as per hierarchy defined on an attribute in the input data (e.g., geographical location of a household at the national, state and county levels). In this paper, we introduce this problem, along with appropriate error metrics and propose a differentially private solution that generates count-of-counts histograms that are consistent across all levels of the hierarchy.

[1]  Yin Yang,et al.  Differentially private histogram publication , 2012, The VLDB Journal.

[2]  H. D. Brunk,et al.  The Isotonic Regression Problem and its Dual , 1972 .

[3]  Chun Yuan,et al.  Differentially Private Data Release through Multidimensional Partitioning , 2010, Secure Data Management.

[4]  Divesh Srivastava,et al.  Differentially Private Spatial Decompositions , 2011, 2012 IEEE 28th International Conference on Data Engineering.

[5]  Úlfar Erlingsson,et al.  RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response , 2014, CCS.

[6]  Ilya Mironov,et al.  On significance of the least significant bits for differential privacy , 2012, CCS.

[7]  Anand D. Sarwate,et al.  Differentially Private Empirical Risk Minimization , 2009, J. Mach. Learn. Res..

[8]  Tim Robertson,et al.  On Estimating Monotone Parameters , 1968 .

[9]  Ninghui Li,et al.  t-Closeness: Privacy Beyond k-Anonymity and l-Diversity , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[10]  Nina Mishra,et al.  Releasing search queries and clicks privately , 2009, WWW '09.

[11]  Andrew McGregor,et al.  Optimizing linear counting queries under differential privacy , 2009, PODS.

[12]  Claude Castelluccia,et al.  Differentially Private Histogram Publishing through Lossy Compression , 2012, 2012 IEEE 12th International Conference on Data Mining.

[13]  Tim Roughgarden,et al.  Universally utility-maximizing privacy mechanisms , 2008, STOC '09.

[14]  Ninghui Li,et al.  Locally Differentially Private Protocols for Frequency Estimation , 2017, USENIX Security Symposium.

[15]  Katrina Ligett,et al.  A Simple and Practical Algorithm for Differentially Private Data Release , 2010, NIPS.

[16]  Yue Wang,et al.  A Data- and Workload-Aware Algorithm for Range Queries Under Differential Privacy , 2014, ArXiv.

[17]  Sanjeev Khanna,et al.  Distributed Private Heavy Hitters , 2012, ICALP.

[18]  Raef Bassily,et al.  Practical Locally Private Heavy Hitters , 2017, NIPS.

[19]  Joseph Bonneau,et al.  Differentially Private Password Frequency Lists , 2016, NDSS.

[20]  Dan Suciu,et al.  Boosting the accuracy of differentially private histograms through consistency , 2009, Proc. VLDB Endow..

[21]  Marianne Winslett,et al.  Differentially private data cubes: optimizing noise sources and consistency , 2011, SIGMOD '11.

[22]  Vishesh Karwa,et al.  Inference using noisy degrees: Differentially private $\beta$-model and synthetic graphs , 2012, 1205.4697.

[23]  Ashwin Machanavajjhala,et al.  Pythia: Data Dependent Differentially Private Algorithm Selection , 2017, SIGMOD Conference.

[24]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[25]  J. Kalbfleisch Statistical Inference Under Order Restrictions , 1975 .

[26]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[27]  Ninghui Li,et al.  Understanding Hierarchical Methods for Differentially Private Histograms , 2013, Proc. VLDB Endow..

[28]  Jianliang Xu,et al.  Towards Accurate Histogram Publication under Differential Privacy , 2014, SDM.

[29]  Bing-Rong Lin,et al.  Information preservation in statistical privacy and bayesian estimation of unattributed histograms , 2013, SIGMOD '13.

[30]  Raef Bassily,et al.  Local, Private, Efficient Protocols for Succinct Histograms , 2015, STOC.

[31]  Leonidas J. Guibas,et al.  The Earth Mover's Distance as a Metric for Image Retrieval , 2000, International Journal of Computer Vision.

[32]  Xing Xie,et al.  PrivTree: A Differentially Private Algorithm for Hierarchical Decompositions , 2016, SIGMOD Conference.