Under the Hood of Membership Inference Attacks on Aggregate Location Time-Series

While location data is extremely valuable for various applications, disclosing it raises serious threats to individuals' privacy. To mitigate such concerns, organizations often provide analysts with aggregate time-series that indicate, e.g., how many people are in a location during a given time interval, rather than raw individual traces. In this paper, we perform a measurement study to understand Membership Inference Attacks (MIAs) on aggregate location time-series, where an adversary tries to infer whether a specific user contributed to the aggregates. We find that the volume of contributed data, as well as the regularity and particularity of users' mobility patterns, play a crucial role in the attack's success. We experiment with a wide range of defenses based on generalization, hiding, and perturbation, and evaluate their ability to thwart the attack vis-à-vis the utility loss they introduce for various mobility analytics tasks. Our results show that some defenses fail across the board, while others work for specific tasks on aggregate location time-series. For instance, suppressing small counts can be used for ranking hotspots, data generalization for forecasting traffic, hotspot discovery, and map inference, while sampling is effective for location labeling and anomaly detection when the dataset is sparse. Differentially private techniques provide reasonable accuracy only in very specific settings, e.g., discovering hotspots and forecasting their traffic, and more so when using weaker privacy notions like crowd-blending privacy. Overall, our measurements show that there does not exist a single generic defense that preserves the utility of the analytics for arbitrary applications, and they provide useful insights regarding the disclosure of sanitized aggregate location time-series.
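To make the defense classes mentioned above concrete, the following minimal Python sketch applies toy instances of hiding (small-count suppression), generalization (spatio-temporal coarsening), sampling, and Laplace perturbation to a synthetic aggregate location time-series. All data shapes, parameter values, and function names here are illustrative assumptions, not the paper's actual pipeline or datasets.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical input: presence[u, l, t] = 1 if user u reported location l
# during time interval t (shapes and sparsity are illustrative assumptions).
num_users, num_locs, num_slots = 50, 4, 24
presence = (rng.random((num_users, num_locs, num_slots)) < 0.1).astype(float)

# Raw aggregate time-series released to analysts: per-location counts per interval.
counts = presence.sum(axis=0)

def suppress_small_counts(c, threshold=3):
    """Hiding: zero out cells whose count falls below a threshold."""
    out = c.copy()
    out[out < threshold] = 0
    return out

def generalize(c, loc_factor=2, time_factor=2):
    """Generalization: merge adjacent locations / intervals into coarser cells."""
    l2, t2 = c.shape[0] // loc_factor, c.shape[1] // time_factor
    return (c[:l2 * loc_factor, :t2 * time_factor]
            .reshape(l2, loc_factor, t2, time_factor).sum(axis=(1, 3)))

def sample_users(p, presence_matrix):
    """Sampling: aggregate only a random fraction p of the contributing users."""
    keep = rng.random(presence_matrix.shape[0]) < p
    return presence_matrix[keep].sum(axis=0)

def laplace_perturb(c, epsilon=1.0, sensitivity=1.0):
    """Perturbation: add Laplace noise with scale sensitivity / epsilon."""
    return c + rng.laplace(scale=sensitivity / epsilon, size=c.shape)

for name, released in [
    ("suppression", suppress_small_counts(counts)),
    ("generalization", generalize(counts)),
    ("sampling", sample_users(0.5, presence)),
    ("laplace", laplace_perturb(counts)),
]:
    print(name, released.shape, round(float(released.sum()), 1))

In the paper's threat model, an adversary observing the released aggregates, together with some prior knowledge of users' mobility, tries to decide whether a target user's trace was among the contributions; defenses like those sketched above trade off how distinguishable that contribution remains against the accuracy of analytics computed on the released counts.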
