Privacy of Dependent Users Against Statistical Matching

Modern applications significantly enhance user experience by adapting to each user’s individual conditions and/or preferences. While this adaptation can greatly improve a user’s experience or be essential for the application to work, the exposure of user data to the application presents a significant privacy threat to the users, even when the traces are anonymized, since statistical matching of an anonymized trace to prior user behavior can identify a user and their habits. Because of the current and growing algorithmic and computational capabilities of adversaries, provable privacy guarantees as a function of the degree of anonymization and obfuscation of the traces are necessary. Our previous work established the requirements on anonymization and obfuscation in the case that data traces are independent between users. However, the data traces of different users will be dependent in many applications, and an adversary can potentially exploit such dependency. In this paper, we consider the negative impact of dependency between user traces on their privacy. First, we demonstrate that the adversary can readily identify the association graph of the obfuscated and anonymized version of the data, revealing which user data traces are dependent. Next, we demonstrate that the adversary can use this association graph to break user privacy with significantly shorter traces than in the case of independent users, and that obfuscating data traces independently across users is often insufficient to remedy such leakage. In other words, we show that inter-user dependency is disastrous to privacy: any non-negligible dependency between users significantly reduces the effectiveness of anonymization and obfuscation schemes. Finally, we discuss how users can improve privacy by employing joint obfuscation that removes or reduces the data dependency.
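As a rough illustration of the first step described above (identifying which anonymized traces are dependent), the following toy sketch thresholds empirical correlations between released traces. It is not the paper's model or algorithm: the binary traces, the copy-based dependency, the bit-flip obfuscation, the threshold value, and all function and parameter names are assumptions made only for this example.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Empirical Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def simulate(n_users=6, m=2000, dependent_pairs=((0, 1), (2, 3)),
             copy_prob=0.8, flip_prob=0.1, seed=0):
    """Build toy traces, obfuscate each user independently, then anonymize.

    Hypothetical setup: user j in a dependent pair copies user i's sample
    with probability copy_prob, otherwise draws a fresh fair bit.
    """
    rng = random.Random(seed)
    traces = [[rng.randint(0, 1) for _ in range(m)] for _ in range(n_users)]
    for i, j in dependent_pairs:
        traces[j] = [a if rng.random() < copy_prob else rng.randint(0, 1)
                     for a in traces[i]]
    # Independent obfuscation: flip each bit with probability flip_prob.
    obfuscated = [[b ^ (rng.random() < flip_prob) for b in t] for t in traces]
    # Anonymization: a secret random permutation of user labels.
    perm = list(range(n_users))
    rng.shuffle(perm)
    released = [None] * n_users
    for user, pseudonym in enumerate(perm):
        released[pseudonym] = obfuscated[user]
    return released, perm


def estimate_association_graph(released, threshold=0.3):
    """Adversary's view: link pseudonyms whose traces correlate strongly."""
    return {(a, b) for a, b in combinations(range(len(released)), 2)
            if abs(pearson(released[a], released[b])) > threshold}
```

In this toy setting, calling `simulate()` and passing the released traces to `estimate_association_graph` yields edges between pseudonyms only; the adversary learns the dependency structure without yet knowing identities. The independent bit flips weaken the correlation between dependent traces but do not remove it, which loosely mirrors the abstract's point that independent obfuscation is often insufficient.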
