Privacy through Fake yet Semantically Real Traces

Camouflaging data by generating fake information is a well-known obfuscation technique for protecting data privacy. In this paper, we focus on a highly sensitive and increasingly exposed type of data: location data. Fake traces are especially valuable for preserving location privacy in two scenarios: publishing datasets of location trajectories, and using location-based services. Despite advances in protecting (location) data privacy, there is no quantitative method to evaluate how realistic a synthetic trace is, or how much utility and privacy it provides in each scenario. A methodology for generating privacy-preserving fake traces is also lacking. We fill this gap and propose the first statistical metric and model for generating fake location traces such that both the utility of the data and the privacy of users are preserved. We build upon the fact that, although people visit geographically distinct locations, their mobility patterns are strongly similar semantically; for example, their transition patterns across activities (e.g., working, driving, staying at home) are similar. We define a statistical metric and propose an algorithm that automatically discovers the hidden semantic similarities between locations from a bag of real location traces used as seeds, without requiring any initial semantic annotations. We guarantee that fake traces are geographically dissimilar to their seeds, so they do not leak sensitive location information. We also protect contributors of seed traces against membership attacks. Interleaving fake traces with mobile users' traces is a prominent location privacy defense mechanism. We quantitatively show the effectiveness of our methodology in protecting against localization inference attacks while preserving the utility of sharing/publishing traces.
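As a rough illustration of the idea that two people can visit disjoint locations yet share the same transition pattern, the following sketch compares two toy traces. It is not the metric or algorithm proposed in the paper: the traces, the location names, and the helper functions `transition_matrix`, `direct_distance`, and `aligned_distance` are all hypothetical, and the brute-force search over relabelings is an assumption chosen for simplicity that is only feasible for a handful of locations.

```python
from itertools import permutations


def transition_matrix(trace, locations):
    """Row-normalized counts of transitions between consecutive visits."""
    index = {loc: i for i, loc in enumerate(locations)}
    n = len(locations)
    counts = [[0.0] * n for _ in range(n)]
    for src, dst in zip(trace, trace[1:]):
        counts[index[src]][index[dst]] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def direct_distance(p, q):
    """Mean absolute difference, comparing location i of p with location i of q."""
    n = len(p)
    return sum(abs(p[i][j] - q[i][j]) for i in range(n) for j in range(n)) / (n * n)


def aligned_distance(p, q):
    """Smallest mean absolute difference over all relabelings of q's locations."""
    n = len(p)
    best = float("inf")
    for perm in permutations(range(n)):
        diff = sum(
            abs(p[i][j] - q[perm[i]][perm[j]]) for i in range(n) for j in range(n)
        )
        best = min(best, diff / (n * n))
    return best


# Hypothetical toy traces: no location is shared between the two users.
alice_trace = ["home_A", "office_A", "cafe_A", "office_A", "home_A", "office_A", "home_A"]
bob_trace = ["home_B", "office_B", "cafe_B", "office_B", "home_B", "office_B", "home_B"]

# The location lists are deliberately in different orders, since no semantic
# labels are assumed to be known in advance.
p = transition_matrix(alice_trace, ["home_A", "office_A", "cafe_A"])
q = transition_matrix(bob_trace, ["cafe_B", "home_B", "office_B"])

naive = direct_distance(p, q)     # compares locations in arbitrary order
aligned = aligned_distance(p, q)  # compares after the best relabeling
print(f"direct: {naive:.3f}, aligned: {aligned:.3f}")
```

In this toy case the aligned distance drops to zero because one trace is an exact relabeling of the other, while the direct comparison reports a clear mismatch. For more than a few locations, the exhaustive search would have to be replaced by an approximate matching procedure.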
