Preserving Privacy in Data Mining using Data Distortion Approach

Data mining, the extraction of hidden predictive information from large databases, uncovers hidden value in the data warehouse. Because of the increasing ability to trace and collect large amounts of personal information, privacy preservation in data mining applications has become an important concern. Data distortion is one of the well-known techniques for privacy-preserving data mining. The objective of data distortion (perturbation) techniques is to distort the individual data values while preserving the underlying statistical distribution properties. These techniques are usually assessed in terms of both their privacy parameters and their associated utility measures. In this paper, we study the use of non-negative matrix factorization (NMF) with sparseness constraints for data distortion.
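
As a rough illustration of the idea, the sketch below factors a non-negative data matrix A into W and H, sparsifies H, and releases the product as the distorted data. It is a minimal sketch under stated assumptions, not the method evaluated in this paper: it uses plain Lee-Seung multiplicative updates, and a hard threshold on H stands in for the sparseness constraints. The function names (nmf, distort, relative_difference), the rank, the threshold value, and the relative Frobenius difference used as a distortion measure are all hypothetical choices made for illustration.

```python
import numpy as np

def nmf(A, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate non-negative A (n x m) as W @ H using Lee-Seung
    multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # The updates multiply by non-negative ratios, so W and H stay non-negative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def distort(A, rank, threshold=0.0, **kwargs):
    """Return a distorted copy of A: factor it, zero the small entries of H
    (an assumed stand-in for a sparseness constraint), and multiply back."""
    W, H = nmf(A, rank, **kwargs)
    H_sparse = np.where(H < threshold, 0.0, H)
    return W @ H_sparse

def relative_difference(A, A_distorted):
    """Relative Frobenius-norm difference between original and distorted data
    (an assumed, simple distortion measure)."""
    return np.linalg.norm(A - A_distorted) / np.linalg.norm(A)

# Synthetic non-negative data: 20 records, 6 attributes.
rng = np.random.default_rng(42)
A = rng.random((20, 6))
A_bar = distort(A, rank=3, threshold=0.05)
print("relative difference:", relative_difference(A, A_bar))
```

In this sketch, a lower rank or a higher threshold would be expected to increase the distortion of individual values, at some cost to how closely the released matrix tracks the original; the appropriate balance depends on the privacy and utility measures being used.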