HTF: Homogeneous Tree Framework for Differentially-Private Release of Location Data

Mobile apps that use location data are pervasive, spanning domains such as transportation, urban planning and healthcare. Important use cases for location data rely on statistical queries, e.g., identifying hotspots where users work and travel. Such queries can be answered efficiently by building histograms. However, precise histograms can expose sensitive details about individual users. Differential privacy (DP) is a mature and widely-adopted protection model, but most approaches for DP-compliant histograms work in a data-independent fashion, leading to poor accuracy. The few proposed data-dependent techniques attempt to adjust histogram partitions based on dataset characteristics, but they do not perform well due to the addition of noise required to achieve DP. We identify density homogeneity as a main factor driving the accuracy of DP-compliant histograms, and we build a data structure that splits the space such that data density is homogeneous within each resulting partition. We show through extensive experiments on large-scale real-world data that the proposed approach achieves superior accuracy compared to existing approaches.

[1]  Ashwin Machanavajjhala,et al.  Principled Evaluation of Differentially Private Algorithms using DPBench , 2015, SIGMOD Conference.

[2]  Marco Gruteser,et al.  USENIX Association , 1992 .

[3]  Yue Wang,et al.  A Data- and Workload-Aware Algorithm for Range Queries Under Differential Privacy , 2014, ArXiv.

[4]  Cynthia Dwork,et al.  Differential Privacy: A Survey of Results , 2008, TAMC.

[5]  Haoran Li,et al.  DPCube: Differentially Private Histogram Release through Multidimensional Partitioning , 2012, Trans. Data Priv..

[6]  Catuscia Palamidessi,et al.  Geo-indistinguishability: differential privacy for location-based systems , 2012, CCS.

[7]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[8]  Ashwin Machanavajjhala,et al.  Optimizing error of high-dimensional statistical queries under differential privacy , 2018, Proc. VLDB Endow..

[9]  Divesh Srivastava,et al.  Differentially Private Spatial Decompositions , 2011, 2012 IEEE 28th International Conference on Data Engineering.

[10]  Johannes Gehrke,et al.  Differential privacy via wavelet transforms , 2009, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[11]  Claude Castelluccia,et al.  Differentially Private Histogram Publishing through Lossy Compression , 2012, 2012 IEEE 12th International Conference on Data Mining.

[12]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[13]  Reza Shokri,et al.  Synthesizing Plausible Privacy-Preserving Location Traces , 2016, 2016 IEEE Symposium on Security and Privacy (SP).

[14]  Gerome Miklau,et al.  An Adaptive Mechanism for Accurate Query Answering under Differential Privacy , 2012, Proc. VLDB Endow..

[15]  Dan Suciu,et al.  Boosting the accuracy of differentially private histograms through consistency , 2009, Proc. VLDB Endow..

[16]  Claude Castelluccia,et al.  Differentially private sequential data publication via variable-length n-grams , 2012, CCS.

[17]  Gerome Miklau,et al.  Optimal error of query sets under the differentially-private matrix mechanism , 2012, ICDT '13.

[18]  Liam McNamara,et al.  SpotME If You Can: Randomized Responses for Location Obfuscation on Mobile Phones , 2011, 2011 31st International Conference on Distributed Computing Systems.

[19]  Aaron Roth,et al.  The Algorithmic Foundations of Differential Privacy , 2014, Found. Trends Theor. Comput. Sci..

[20]  Panos Kalnis,et al.  A reciprocal framework for spatial K-anonymity , 2010, Inf. Syst..

[21]  Xing Xie,et al.  PrivTree: A Differentially Private Algorithm for Hierarchical Decompositions , 2016, SIGMOD Conference.

[22]  Elisa Bertino,et al.  Private record matching using differential privacy , 2010, EDBT '10.

[23]  Chun Yuan,et al.  Differentially Private Data Release through Multidimensional Partitioning , 2010, Secure Data Management.

[24]  Benjamin C. M. Fung,et al.  Differentially private transit data publication: a case study on the montreal transportation system , 2012, KDD.

[25]  Ninghui Li,et al.  Differentially private grids for geospatial data , 2012, 2013 IEEE 29th International Conference on Data Engineering (ICDE).