Knowledge Reserving in Privacy Preserving Data Mining

In this paper we present a novel method for protecting data privacy in data mining. Privacy is an increasingly important issue in many data mining applications. Among current privacy-preserving techniques, data anonymization provides a simple and effective way to protect sensitive data. However, most anonymization algorithms lose data details, so the resulting dataset is far less informative than the original. Our method anonymizes the dataset statistically and preserves not only the data details but also the useful knowledge in the data. We also analyze in detail the accuracy and privacy levels of our method. Experimental results comparing it with existing methods further demonstrate its effectiveness.
