Differentially Private Wireless Data Publication in Large-Scale WLAN Networks

Wireless trace data play an important role in wireless network research. However, publishing raw WLAN traces poses privacy risks to network users. It is therefore necessary to sanitize users' sensitive information before the traces are published, while preserving high data utility for wireless network research. Although some existing works based on various anonymization methods have started to address the problem of sanitizing WLAN traces, anonymization techniques cannot provide strong and provable privacy guarantees. Differential privacy is the only framework that can provide strong and provable privacy guarantees. However, we find that existing studies on differential privacy fail to provide effective data utility on multi-dimensional and large-scale datasets. Targeting WLAN trace datasets, which are characteristically multi-dimensional and large-scale, this paper proposes a privacy-preserving data publishing algorithm that not only satisfies differential privacy but also achieves high data utility. Furthermore, theoretical analysis shows that the noise variance of our sanitization algorithm is O(log^{o(1)} n / ϵ^2), which indicates that the algorithm can achieve higher data utility on large-scale datasets. Moreover, extensive experiments on a large-scale WLAN trace dataset show that our sanitization algorithm can provide high data utility.
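
For intuition about the 1/ϵ^2 factor in the noise variance, the following is a minimal sketch of the standard Laplace mechanism applied to a single count query. It is not the sanitization algorithm proposed in this paper; the function names, the sensitivity of 1, and the trial count are assumptions chosen for illustration. Laplace noise with scale b has variance 2b^2, so with b = sensitivity/ϵ the variance is 2(sensitivity/ϵ)^2, and halving ϵ roughly quadruples it.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    # Standard Laplace mechanism: noise scale = sensitivity / epsilon.
    # A sensitivity of 1 is an assumption that fits a simple count query.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def empirical_noise_variance(epsilon, trials=20000, seed=0):
    # Estimate the variance of the added noise by releasing a zero count
    # many times (hypothetical helper, for illustration only).
    rng = random.Random(seed)
    samples = [private_count(0.0, epsilon, rng) for _ in range(trials)]
    return statistics.pvariance(samples)


if __name__ == "__main__":
    for eps in (1.0, 0.5):
        theoretical = 2.0 / eps ** 2
        print(eps, theoretical, empirical_noise_variance(eps))
```

The empirical estimates should land close to the theoretical values 2/ϵ^2, which is the baseline that a per-query analysis gives before any structure in the dataset is exploited.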