Privacy or Utility in Data Collection? A Contract Theoretic Approach

With the growing popularity of data mining, privacy has become an increasingly important issue. Privacy can be seen as a special type of good, in the sense that its owner can trade it for incentives. In this paper, we consider a private data collection scenario in which a data collector buys data from multiple data owners and employs anonymization techniques to protect the owners' privacy. Anonymization reduces data utility; therefore, a data owner can sell his data only at a lower price if his privacy is better protected. Can one pursue higher data utility while maintaining acceptable privacy? How to balance the trade-off between privacy protection and data utility is an important question for big data. Considering that different data owners treat privacy differently and that their privacy preferences are unknown to the collector, we propose a contract-theoretic approach for the data collector to deal with this trade-off. By designing an optimal contract, the collector can make rational decisions on how to pay the data owners and, more importantly, how to protect their privacy. We show that when the collector requires a large amount of data, he should ask data owners who care less about privacy to provide as much data as possible. We also find that whenever the collector requires higher data utility or the data becomes less profitable, he should provide stronger protection of the owners' privacy. The performance of the proposed contract is evaluated through both numerical simulations and real-data experiments.
