Asynchronous side information attack from the edge: an approach to identify participants from anonymous mobility traces

With the increasing adoption of location-based social network applications, a large number of human mobility traces have been collected and published to assist mobile system design and scientific research. Most mobility traces are anonymized before publication by replacing the true IDs and introducing noise. In this paper, we show that such anonymous mobility traces are vulnerable to an asynchronous side information attack from the edge: if partial movement information is exposed to compromised edge nodes, even after the data collection period, the adversary can identify the participant in the anonymous mobility traces with high probability. Our identification method exploits the accumulated temporal and spatial characteristics of individual movement. We introduce the $\delta$-partition, which divides user locations into sub-areas, and the $\epsilon$-partition, which groups user activities into time intervals. We show that a mobility trace can be uniquely represented by a set of frequent locations together with their active time intervals. We further derive a similarity measure that the adversary can use for the asynchronous side information attack. We provide a theoretical analysis proving that an anonymous participant can be correctly identified with high probability under certain conditions. Extensive experiments on three typical mobility datasets, covering the movement of buses, taxis, and humans, show identification success ratios of 99%, 45%, and 72%, respectively.
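The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formal definitions, so everything here is an assumption made for illustration: the $\delta$-partition is modeled as a square grid of side `delta`, the $\epsilon$-partition as time-of-day slots of width `epsilon` seconds, frequent locations as the `top_k` most visited cells, and the similarity as a Jaccard overlap of (cell, slot) pairs. The function names (`delta_cell`, `epsilon_slot`, `profile`, `similarity`, `identify`) are hypothetical and this is not the paper's actual measure.

```python
from collections import Counter


def delta_cell(x, y, delta):
    """Map a coordinate to its grid cell (assumed form of the delta-partition)."""
    return (int(x // delta), int(y // delta))


def epsilon_slot(t, epsilon, period=86400):
    """Map a timestamp in seconds to a time-of-day slot (assumed epsilon-partition).

    Folding by a daily period lets side information observed on a later day
    line up with the published trace, which is what makes the match asynchronous.
    """
    return int((t % period) // epsilon)


def profile(trace, delta, epsilon, top_k=3):
    """Return {cell: set of active slots} for the top_k most visited cells."""
    counts = Counter(delta_cell(x, y, delta) for x, y, _ in trace)
    frequent = {cell for cell, _ in counts.most_common(top_k)}
    result = {cell: set() for cell in frequent}
    for x, y, t in trace:
        cell = delta_cell(x, y, delta)
        if cell in frequent:
            result[cell].add(epsilon_slot(t, epsilon))
    return result


def similarity(p, q):
    """Jaccard overlap of (cell, slot) pairs; a stand-in for the paper's measure."""
    a = {(c, s) for c, slots in p.items() for s in slots}
    b = {(c, s) for c, slots in q.items() for s in slots}
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def identify(side_info, anonymous_traces, delta, epsilon):
    """Return the pseudonym whose profile best matches the side information."""
    target = profile(side_info, delta, epsilon)
    return max(
        anonymous_traces,
        key=lambda pid: similarity(
            target, profile(anonymous_traces[pid], delta, epsilon)
        ),
    )
```

Each trace is a list of `(x, y, t)` tuples, and `anonymous_traces` maps a pseudonym to such a list. Calling `identify(side_info, anonymous_traces, delta, epsilon)` returns the pseudonym whose frequent cells and active slots overlap most with the adversary's partial observations. Coarser `delta` and `epsilon` values tolerate more noise but make distinct participants harder to tell apart.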
