Data Privacy Games

With the growing popularity of big data applications, data mining technologies have attracted increasing attention in recent years. At the same time, the serious threat that data mining may pose to individual privacy has become a major concern. How to resolve the conflict between big data and individual privacy is an urgent issue. In this chapter, we systematically review the privacy issues related to data mining and investigate various approaches that can help protect privacy. Following the basic procedure of data mining, we identify four types of users involved in big data applications: the data provider, the data collector, the data miner, and the decision maker. For each type of user, we discuss its privacy concerns and the methods it can adopt to protect sensitive information. We introduce the basics of related research topics, review state-of-the-art approaches, and present some preliminary thoughts on future research directions. In particular, we emphasize the game-theoretic approaches proposed for analyzing the interactions among different users in a data mining scenario. By differentiating the responsibilities of different users with respect to information security, we aim to provide useful insights into the trade-off between data exploration and privacy protection.
