Privacy-Preserving Data Mining: Development and Directions

This article first describes the privacy concerns that data mining raises, especially in national security applications. We then discuss privacy-preserving data mining. In particular, we view the privacy problem as a form of the inference problem and introduce the notion of privacy constraints. We also describe an approach to privacy constraint processing and discuss its relationship to privacy-preserving data mining. We then give an overview of developments in privacy-preserving data mining that attempt to maintain privacy while still extracting useful information from the data. Finally, we give some directions for future research on privacy as it relates to data mining.
