Differential Privacy of Hierarchical Census Data: An Optimization Approach

This paper is motivated by the needs of a Census Bureau that wants to release aggregate socio-economic data about a large population without revealing sensitive information about individuals. The released data may include the number of individuals living alone, the number of cars they own, or their salary brackets. Recent events have highlighted some of the privacy challenges faced by such organizations. To address them, this paper presents a novel differential-privacy mechanism for releasing hierarchical counts of individuals satisfying a given property. The counts are reported at multiple granularities (e.g., the national, state, and county levels) and must be consistent across levels. The core of the mechanism is an optimization model that redistributes the noise introduced to attain privacy so that the consistency constraints between the hierarchical levels are met. The key technical contribution is to show that this optimization problem can be solved in polynomial time by exploiting the structure of its cost functions. Experimental results on very large, real datasets show that the proposed mechanism improves computational efficiency and accuracy by up to two orders of magnitude with respect to other state-of-the-art techniques.
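
To make the idea of "redistributing noise to meet consistency constraints" concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a two-level hierarchy (one state, several counties), an even split of the privacy budget between the two levels, Laplace noise with sensitivity 1 per level, and a plain least-squares projection as the post-processing step. The function names (`laplace_noise`, `noisy_release`, `make_consistent`) and the example counts are hypothetical, and the sketch ignores requirements such as non-negative or integral counts.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_release(parent_count, child_counts, epsilon, rng):
    # Assumption: the budget epsilon is split evenly over the two levels,
    # and one individual changes each level's counts by at most 1,
    # so each level uses Laplace noise of scale 1 / (epsilon / 2).
    scale = 2.0 / epsilon
    noisy_parent = parent_count + laplace_noise(scale, rng)
    noisy_children = [c + laplace_noise(scale, rng) for c in child_counts]
    return noisy_parent, noisy_children

def make_consistent(noisy_parent, noisy_children):
    # Least-squares projection onto the constraint parent == sum(children):
    # the discrepancy is spread equally over the parent and the k children.
    k = len(noisy_children)
    shift = (noisy_parent - sum(noisy_children)) / (k + 1)
    return noisy_parent - shift, [c + shift for c in noisy_children]

rng = random.Random(0)
county_counts = [120, 45, 310]          # hypothetical true county counts
state_count = sum(county_counts)        # true state count

noisy_state, noisy_counties = noisy_release(
    state_count, county_counts, epsilon=1.0, rng=rng
)
state, counties = make_consistent(noisy_state, noisy_counties)

print("noisy, inconsistent:", noisy_state, noisy_counties)
print("post-processed:     ", state, counties)
```

Because the projection only operates on the already-noised values, it is post-processing and does not consume additional privacy budget. The paper's optimization model is described as exploiting the structure of its cost functions to run in polynomial time; the closed-form projection above is only a stand-in that illustrates the consistency requirement.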
