Privacy-Accuracy Trade-Off in Distributed Data Mining

Privacy is a central concern in distributed data mining. Each participant needs to ensure that its private data are not disclosed to other participants or to a third party. One way to protect privacy is to apply differential privacy and perturb the data before sharing them, but the added noise generally degrades the mining result. Each participant therefore faces a trade-off between privacy and the quality of the mining result. In this chapter, we study a distributed classification scenario in which a mediator builds a classifier from the perturbed query results returned by a number of users. We propose a game-theoretic approach to analyze how users choose their privacy budgets. Specifically, we model the interactions among users as a game in satisfaction form and propose an algorithm with which users learn the satisfaction equilibrium (SE) of the game. Experimental results demonstrate that, when the differences among users’ expectations are not significant, the proposed learning algorithm can converge to an SE, at which every user achieves a balance between the accuracy of the classifier and the privacy it preserves.
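
To make the trade-off concrete, the following is a minimal sketch of how a single user might perturb a count query under a privacy budget. It assumes the standard Laplace mechanism with sensitivity 1; the abstract does not state which mechanism the chapter uses, and the names `laplace_noise`, `perturb_count`, and `mean_abs_error` are hypothetical, introduced only for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_count(true_count, epsilon, rng, sensitivity=1.0):
    # Assumed Laplace mechanism: the noise scale is sensitivity / epsilon,
    # so a smaller privacy budget (epsilon) means a noisier answer.
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon, trials=20000, seed=0):
    # Empirical average distortion of the reported count for a given budget.
    rng = random.Random(seed)
    true_count = 100  # illustrative value, not taken from the chapter
    total = 0.0
    for _ in range(trials):
        total += abs(perturb_count(true_count, epsilon, rng) - true_count)
    return total / trials


if __name__ == "__main__":
    for eps in (0.1, 0.5, 1.0, 2.0):
        print(f"epsilon={eps}: mean absolute error ~ {mean_abs_error(eps):.2f}")
```

Under this assumption the average distortion is roughly 1/epsilon, so a user who chooses a smaller budget gains privacy at the cost of a noisier contribution to the mediator's classifier.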
