An Effective Data Transformation Approach for Privacy Preserving Clustering

Recent advances in data mining, Internet, and security technologies have given rise to a new stream of research: privacy-preserving data mining. Sharing data among organizations is considered useful because it offers mutual benefit for business growth, but preserving the privacy of shared data for clustering is considered the most challenging problem. To overcome it, the data owner publishes the data after randomly modifying the original values in a way that disguises the sensitive information while preserving a particular data property. Data transformation techniques therefore play a vital role in preserving privacy in data mining. We put forward an effective approach to protecting the privacy of confidential categorical data in clustering. We introduce a set of hybrid data transformations (HDTTR and HDTSR), present a formal study of the problem, and give a complete analysis of the proposed approach. We illustrate its effectiveness by comparing the clustering of sensitive categorical data before and after the transformation.
