A General Framework for Privacy Preserving Sequential Data Publishing

In this paper, we study the problem of privacy-preserving sequential data publishing for microdata tables. For example, a hospital might release patient records every three months. An adversary may uncover confidential information about an individual by linking quasi-identifier attributes with their associated sensitive values across different publications of the data set. Most existing approaches prevent this linking attack at the cost of reduced data utility. We propose a general framework and algorithm that handles sequential data publishing and protects the published data sets from the linking attack. The proposed sequential algorithm satisfies l-diversity and improves data utility during publication. Experimental results show that the proposed framework protects the published data sets against the linking attack and preserves more data utility than existing methods.
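To make the linking attack concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes the simplest "distinct" form of l-diversity (each quasi-identifier group contains at least l different sensitive values) and an adversary who knows the target's generalized quasi-identifier values in every release and knows that the target's sensitive value does not change. The table contents, column names, and function names are all hypothetical.

```python
from collections import defaultdict

def sensitive_values_by_group(release, qi_cols, sensitive_col):
    """Map each quasi-identifier (QI) group to the set of sensitive values it contains."""
    groups = defaultdict(set)
    for row in release:
        key = tuple(row[c] for c in qi_cols)
        groups[key].add(row[sensitive_col])
    return dict(groups)

def is_distinct_l_diverse(release, qi_cols, sensitive_col, l):
    """Distinct l-diversity: every QI group holds at least l different sensitive values."""
    groups = sensitive_values_by_group(release, qi_cols, sensitive_col)
    return all(len(values) >= l for values in groups.values())

def linking_candidates(releases, target_keys, qi_cols, sensitive_col):
    """Intersect the target's candidate sensitive values across releases.

    Assumes the target appears in every release with an unchanged sensitive value.
    """
    candidates = None
    for release, key in zip(releases, target_keys):
        values = sensitive_values_by_group(release, qi_cols, sensitive_col).get(key, set())
        candidates = set(values) if candidates is None else candidates & values
    return candidates if candidates is not None else set()

# Hypothetical quarterly releases with generalized QI attributes.
QI = ["age", "zip"]
release_q1 = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "bronchitis"},
    {"age": "40-49", "zip": "148**", "disease": "gastritis"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
]
release_q2 = [
    {"age": "30-39", "zip": "130**", "disease": "bronchitis"},
    {"age": "30-39", "zip": "130**", "disease": "gastritis"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "dyspepsia"},
]

target = ("30-39", "130**")  # the target's QI group in both releases
leaked = linking_candidates([release_q1, release_q2], [target, target], QI, "disease")
print(is_distinct_l_diverse(release_q1, QI, "disease", 2))
print(is_distinct_l_diverse(release_q2, QI, "disease", 2))
print(leaked)
```

In this toy data each release is 2-diverse when examined alone, yet intersecting the target's group across the two releases leaves a single candidate value. This is the kind of cross-release disclosure that a sequential publishing scheme has to rule out.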
