On the Inference of User Paths from Anonymized Mobility Data

Using the plethora of apps on smartphones and tablets entails giving them access to different types of privacy-sensitive information, including the device's location. This can potentially compromise user privacy when app providers share user data with third parties (e.g., advertisers) for monetization purposes. In this paper, we focus on the interface for data sharing between app providers and third parties, and devise an attack that can break the strongest form of the commonly used anonymization method for protecting the privacy of users. More specifically, we develop a mechanism called Comber that, given completely anonymized mobility data (without any pseudonyms) as input, is able to identify different users and their respective paths in the data. Comber exploits the observation that the distribution of speeds is typically similar among different users and incorporates a generic, empirically derived histogram of user speeds to identify the users and disentangle their paths. Comber also benefits from two optimizations that allow it to reduce the path inference time for large datasets. We use two real datasets with mobile user location traces (Mobile Data Challenge and GeoLife) to evaluate the effectiveness of Comber and show that it can infer paths with greater than 90% accuracy on both datasets.
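
To make the core idea concrete, the following is a minimal sketch of how a generic speed histogram could be used to link anonymous location samples into per-user paths. It is not Comber's actual algorithm, which the abstract does not specify. The histogram values, the function names (`speed_prob`, `link_snapshots`, `infer_paths`), the assumption that every user reports one sample per time step, and the brute-force search over assignments are all illustrative assumptions.

```python
import math
from itertools import permutations

# Hypothetical speed histogram: (upper bin edge in m/s, probability).
# These values are made up for illustration, not taken from the paper.
SPEED_HIST = [(2.0, 0.60), (10.0, 0.25), (30.0, 0.14), (math.inf, 0.01)]


def speed_prob(speed):
    """Look up the probability of the histogram bin containing `speed`."""
    for upper, prob in SPEED_HIST:
        if speed < upper:
            return prob
    return SPEED_HIST[-1][1]


def link_snapshots(prev_pts, next_pts, dt):
    """Find the assignment of next_pts to prev_pts with the highest
    log-likelihood under the speed histogram.

    Returns a tuple `perm` where prev_pts[i] is linked to next_pts[perm[i]].
    Brute force over all assignments, so only usable for a handful of users.
    """
    best, best_score = None, -math.inf
    for perm in permutations(range(len(next_pts)), len(prev_pts)):
        score = sum(
            math.log(speed_prob(math.dist(prev_pts[i], next_pts[j]) / dt))
            for i, j in enumerate(perm)
        )
        if score > best_score:
            best, best_score = perm, score
    return best


def infer_paths(snapshots, dt):
    """Stitch unordered per-time-step point sets into one path per user.

    Assumes each snapshot holds exactly one (x, y) point per user.
    """
    paths = [[p] for p in snapshots[0]]
    for nxt in snapshots[1:]:
        ends = [path[-1] for path in paths]
        perm = link_snapshots(ends, nxt, dt)
        for path, j in zip(paths, perm):
            path.append(nxt[j])
    return paths


# Two users, one walking (1 m/s) and one driving (20 m/s); the order of
# points inside each snapshot carries no identity information.
snapshots = [
    [(0.0, 0.0), (100.0, 0.0)],
    [(120.0, 0.0), (1.0, 0.0)],
    [(2.0, 0.0), (140.0, 0.0)],
]
paths = infer_paths(snapshots, dt=1.0)
```

In this toy input the two paths separate cleanly because linking the walker to the driver's next sample would imply a speed that falls in the rarest histogram bin. A practical attack would need a far more scalable assignment step and would have to handle missing samples, which is presumably where the two optimizations mentioned in the abstract come in.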
