Security in Outsourcing of Association Rule Mining

Outsourcing association rule mining to an outside service provider brings several important benefits to the data owner. These include (i) relief from the high mining cost, (ii) reduced resource demands, and (iii) effective centralized mining for multiple distributed owners. On the other hand, security is an issue: the service provider should be prevented from accessing the actual data, since (i) the data may be associated with private information, and (ii) the frequency analysis is meant to be used solely by the owner. This paper proposes substitution cipher techniques for encrypting transactional data when outsourcing association rule mining. After identifying the non-trivial threats to a straightforward one-to-one item mapping substitution cipher, we propose a more secure encryption scheme based on a one-to-n item mapping that transforms transactions non-deterministically, yet guarantees correct decryption. We develop an effective and efficient encryption algorithm based on this method. Our algorithm performs a single pass over the database and is therefore suitable for applications in which data owners send streams of transactions to the service provider. We carry out a comprehensive cryptanalysis study. The results show that our technique is highly secure and has a low data transformation cost.
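
To make the idea of a one-to-n item mapping concrete, the following is a minimal sketch and not the algorithm developed in the paper, whose details the abstract does not give. It assumes that each plaintext item is assigned n cipher codes, that the code sets of different items are disjoint, and that encryption picks one code at random for each item. The names `build_key`, `invert_key`, `encrypt_stream`, and `decrypt` are hypothetical, chosen only for illustration.

```python
import random
from typing import Dict, FrozenSet, Iterable, Iterator, List


def build_key(items: Iterable[str], n: int, rng: random.Random) -> Dict[str, List[int]]:
    """Assign each plaintext item n cipher codes (assumption: code sets are disjoint)."""
    ordered = sorted(items)
    codes = list(range(len(ordered) * n))
    rng.shuffle(codes)  # hide any relation between item order and code values
    return {item: codes[i * n:(i + 1) * n] for i, item in enumerate(ordered)}


def invert_key(key: Dict[str, List[int]]) -> Dict[int, str]:
    """Build the code -> item table; disjoint code sets make it unambiguous."""
    return {code: item for item, codes in key.items() for code in codes}


def encrypt_stream(
    transactions: Iterable[Iterable[str]],
    key: Dict[str, List[int]],
    rng: random.Random,
) -> Iterator[FrozenSet[int]]:
    """Encrypt transactions one at a time, in a single pass over the input."""
    for transaction in transactions:
        # Non-deterministic step: each item is replaced by one of its n codes.
        yield frozenset(rng.choice(key[item]) for item in transaction)


def decrypt(cipher_transaction: Iterable[int], inverse: Dict[int, str]) -> FrozenSet[str]:
    """Recover the original items from a cipher transaction."""
    return frozenset(inverse[code] for code in cipher_transaction)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the demo repeatable
    key = build_key({"bread", "milk", "beer"}, n=3, rng=rng)
    inverse = invert_key(key)
    for cipher in encrypt_stream([{"bread", "milk"}, {"beer"}], key, rng):
        print(sorted(cipher), "->", sorted(decrypt(cipher, inverse)))
```

Because the same item can appear under different codes in different transactions, a simple frequency count of cipher codes no longer mirrors the plaintext item frequencies, while the disjoint code sets keep decryption exact. The generator form reflects the single-pass, streaming setting described above. The random module is not a cryptographic source, so a real deployment would need a secure generator.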
