A Novel Sanitization Approach for Privacy Preserving Utility Itemset Mining

Data mining plays a vital role in today’s information world and is widely applied across business organizations. Current trends in business collaboration require organizations to share data or mined results for mutual benefit. However, releasing data also raises the threat of revealing sensitive information. Data sanitization is the process of concealing the sensitive itemsets in the source database through appropriate modifications and then releasing the modified database. Finding an optimal sanitization that minimizes the number of non-sensitive patterns lost is NP-hard. Recent data sanitization approaches hide sensitive itemsets by reducing their support, a measure that considers only the presence or absence of items in a transaction. In real-world scenarios, however, transactions record the purchased quantity of each item along with its unit price. Hence it is essential to consider the utility of itemsets in the source database. To address this, the utility mining model was introduced to find high utility itemsets. In this paper, we focus primarily on protecting privacy in utility mining. We consider the utility of the itemsets and propose a novel sanitization approach that makes minimal changes to the database and removes a minimum number of non-sensitive itemsets.
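
To make the distinction between support and utility concrete, the following minimal sketch computes both for a toy database. It assumes the definition commonly used in the utility mining literature, in which the utility of an itemset in a transaction is the sum of quantity times unit price over its items, accumulated over the transactions that contain the whole itemset. The function names, the toy transactions, and the prices are hypothetical and serve only as illustration; they are not taken from this paper.

```python
from typing import Dict, Iterable, List

# A transaction maps each purchased item to its quantity (hypothetical layout).
Transaction = Dict[str, int]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> int:
    """Count the transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items.issubset(t))


def itemset_utility(
    itemset: Iterable[str],
    transactions: List[Transaction],
    unit_price: Dict[str, int],
) -> int:
    """Sum quantity * unit price over all transactions containing the itemset."""
    items = set(itemset)
    total = 0
    for t in transactions:
        if items.issubset(t):
            total += sum(t[item] * unit_price[item] for item in items)
    return total


# Toy data, assumed for illustration only.
transactions = [
    {"A": 1, "B": 2},
    {"A": 4, "C": 1},
    {"A": 1, "B": 10},
    {"B": 1, "C": 3},
]
unit_price = {"A": 5, "B": 2, "C": 10}

for candidate in (["A"], ["A", "B"], ["B", "C"]):
    print(
        candidate,
        "support =", support(candidate, transactions),
        "utility =", itemset_utility(candidate, transactions, unit_price),
    )
```

In this toy data the itemset {B, C} appears in only one transaction, yet its utility is higher than that of {A}, which appears in three. A support-based sanitization would not treat these itemsets the way a utility-based one would, which is the gap the abstract points to.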
