Differentially Private Publication of Location Entropy (Technical Report)

Location entropy (LE) is a widely used metric for measuring the popularity of locations (e.g., points-of-interest). It is used in numerous applications in geo-marketing, crime analysis, epidemiology, traffic incident analysis, spatial crowdsourcing, and geosocial networks. Unlike frequency-based metrics, which are computed only from the number of (unique) visits to a location, LE also captures the diversity of the users' visits, and is thus more accurate. Current solutions for computing LE require full access to users' past visits to locations, which poses privacy threats. This paper discusses, for the first time, the problem of perturbing location entropy for a set of locations according to differential privacy. The problem is challenging because removing a single user from the dataset affects multiple records of the database, i.e., all the visits made by that user to various locations. To this end, we first derive non-trivial, tight bounds for both the local and global sensitivity of LE, and show that satisfying ε-differential privacy requires a large amount of noise, rendering the published results useless. Hence, we propose a thresholding technique that limits the number of users' visits, which significantly reduces the perturbation error but introduces an approximation error. To achieve better utility, we extend the technique by adopting two weaker notions of privacy: smooth sensitivity (slightly weaker) and crowd-blending (strictly weaker). Extensive experiments on synthetic and real-world datasets show that our proposed techniques preserve the original data distribution without compromising location privacy.
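
As an illustration of the pipeline the abstract describes, the following is a minimal Python sketch, not the paper's algorithm. It assumes the standard definition of location entropy as the Shannon entropy of per-user visit shares at a location, caps each user's visit count at a hypothetical threshold `cap`, and adds Laplace noise with scale `sensitivity / epsilon`. The function names and the `sensitivity` argument are assumptions made for this example: the caller has to supply a valid sensitivity bound, such as the ones derived in the paper, and the sketch handles a single location only, so it does not account for a user who visits many locations.

```python
import math
import random


def location_entropy(visits_per_user):
    """Shannon entropy (natural log) of per-user visit shares at one location."""
    total = sum(visits_per_user.values())
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log(count / total)
        for count in visits_per_user.values()
        if count > 0
    )


def cap_visits(visits_per_user, cap):
    """Thresholding step: no user contributes more than `cap` visits (hypothetical parameter)."""
    return {user: min(count, cap) for user, count in visits_per_user.items()}


def laplace_noise(scale, rng=random):
    """Laplace(0, scale) sample, drawn as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_location_entropy(visits_per_user, cap, sensitivity, epsilon, rng=random):
    """Capped entropy plus Laplace noise; `sensitivity` is assumed to be supplied by the caller."""
    capped = cap_visits(visits_per_user, cap)
    return location_entropy(capped) + laplace_noise(sensitivity / epsilon, rng)
```

Capping changes the entropy value itself (the approximation error mentioned above) in exchange for a smaller noise scale. For example, a location visited 9 times by one user and once by another has low entropy, while after capping at 1 the two users contribute equally and the entropy rises to log 2.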
