Protecting sensitive knowledge in association patterns mining

Mining association rules from huge amounts of data is an important issue in data mining, and the discovered information is often commercially valuable. Moreover, companies that conduct similar business are often willing to collaborate by mining significant knowledge patterns from their combined datasets for mutual benefit. However, in a cooperative project, some of these companies may not want certain strategic or private data, called sensitive patterns, to be discoverable in the shared database. Therefore, before the database is released for sharing, these sensitive patterns have to be hidden because of privacy or security concerns. To address this need, the sensitive-knowledge-hiding (association rule hiding) problem has been studied by the research community working on security and knowledge discovery. The aim of hiding algorithms is to let as much nonsensitive knowledge as possible be extracted from the collaborative databases while protecting the sensitive information. The sensitive-knowledge-hiding problem has been proven to be NP-hard (nondeterministic polynomial-time hard), and a large body of research has since been devoted to solving it. In this article, we introduce and discuss the major categories of sensitive-knowledge-protecting methodologies. © 2011 Wiley Periodicals, Inc.
