Multiobjective-integer-programming-based Sensitive Frequent Itemsets Hiding

Because frequent patterns discovered in large databases carry substantial commercial value, frequent itemset mining has become one of the most important topics in data mining. However, mining also increases the risk of disclosing sensitive patterns. In this paper, we propose a multi-objective integer programming model, which considers both data accuracy and information loss, for the problem of hiding sensitive frequent itemsets. We solve this optimization model with a two-phase procedure: the first phase identifies the transactions to be sanitized, and the second phase identifies the items to be sanitized within those transactions. Finally, we conduct extensive tests on publicly available real data. The experimental results indicate that our approach is effective.
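
To make the two-phase idea concrete, the following is a minimal Python sketch under stated assumptions. It is not the integer programming model proposed in the paper: a simple greedy heuristic stands in for the optimization, and the names `support`, `hide_itemset`, and `min_support` (an absolute count threshold) are hypothetical, introduced only for illustration. Phase 1 picks which supporting transactions to sanitize; phase 2 picks which item to remove from each of them.

```python
from typing import List, Set


def support(itemset: Set[str], transactions: List[Set[str]]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def hide_itemset(transactions: List[Set[str]],
                 sensitive: Set[str],
                 min_support: int) -> List[Set[str]]:
    """Greedy stand-in for a two-phase sanitization (assumption, not the paper's model).

    An itemset is treated as frequent when its support is >= min_support.
    Returns a sanitized copy in which `sensitive` is no longer frequent.
    """
    sanitized = [set(t) for t in transactions]  # work on a copy
    supporting = [i for i, t in enumerate(sanitized) if sensitive <= t]

    # Number of supporting transactions that must stop supporting the itemset.
    excess = len(supporting) - min_support + 1
    if excess <= 0:
        return sanitized  # already infrequent, nothing to hide

    # Phase 1: choose transactions to sanitize.
    # Heuristic assumption: shorter transactions support fewer other itemsets,
    # so altering them is likely to cause less information loss.
    victims = sorted(supporting, key=lambda i: len(sanitized[i]))[:excess]

    # Phase 2: choose the item to remove from each chosen transaction.
    # Heuristic assumption: remove the sensitive item with the highest
    # current single-item support (ties broken alphabetically).
    for i in victims:
        item = min(sensitive, key=lambda x: (-support({x}, sanitized), x))
        sanitized[i].discard(item)

    return sanitized
```

A caller would pass the database as a list of item sets, for example `hide_itemset(db, {"a", "b"}, min_support=2)`, and then re-mine the returned database to check which non-sensitive itemsets were lost as a side effect. An exact approach such as the one the abstract describes would replace both heuristics with decisions derived from the optimization model.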
