Protecting business intelligence and customer privacy while outsourcing data mining tasks

Data mining plays an important role in decision making. Because many organizations lack in-house data mining expertise, it is beneficial for them to outsource data mining tasks to external service providers. However, most organizations hesitate to do so for fear of losing business intelligence and compromising customer privacy. In this paper, we present a Bloom filter based solution that enables organizations to outsource association rule mining while protecting their business intelligence and customer privacy. Our approach can achieve high precision in data mining at the cost of increased storage.
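To make the precision versus storage trade-off concrete, the following is a minimal sketch of a generic Bloom filter, not the construction used in the paper, which the abstract does not describe. The class name `BloomFilter`, the helper `expected_false_positive_rate`, the SHA-256 double-hashing scheme, and the item names are all assumptions chosen for illustration. The sketch shows only the standard property that a larger bit array lowers the expected false positive rate for a fixed number of inserted items.

```python
import hashlib
import math


class BloomFilter:
    """Generic Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive the k bit positions by double hashing
        # over a single SHA-256 digest of the item.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True if every bit for the item is set; may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def expected_false_positive_rate(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Standard approximation (1 - e^(-k*n/m))^k for m bits, k hashes, n items."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


# Hypothetical items from a transaction; every inserted item is always found.
bf = BloomFilter(num_bits=1024, num_hashes=4)
for item in ["bread", "milk", "beer"]:
    bf.add(item)
assert "milk" in bf

# More storage (bits) gives a lower expected false positive rate.
small = expected_false_positive_rate(num_bits=1024, num_hashes=4, num_items=100)
large = expected_false_positive_rate(num_bits=8192, num_hashes=4, num_items=100)
assert large < small
```

In this sketch, membership tests never miss an inserted item, and the only error is a false positive whose probability shrinks as `num_bits` grows. That is the general sense in which a Bloom filter trades storage for precision.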
