Suppressing data sets to prevent discovery of association rules

Enterprises have been collecting data for many reasons including better customer relationship management, and high-level decision making. Public safety was another motivation for large-scale data collection efforts initiated by government agencies. However, such widespread data collection efforts coupled with powerful data analysis tools raised concerns about privacy. This is due to the fact that collected data may contain confidential information. One method to ensure privacy is to selectively hide confidential information from the data sets to be disclosed. In this paper, we focus on hiding confidential correlations. We introduce a heuristic to reduce the information loss and propose a blocking method that prevents discovery of confidential correlations while preserving the usefulness of the data set.

[1]  Osmar R. Zaïane,et al.  Protecting sensitive knowledge by data sanitization , 2003, Third IEEE International Conference on Data Mining.

[2]  Elisa Bertino,et al.  Association rule hiding , 2004, IEEE Transactions on Knowledge and Data Engineering.

[3]  Chris Clifton,et al.  Using unknowns to prevent discovery of association rules , 2001, SGMD.