An Efficient Approximate Protocol for Privacy-Preserving Association Rule Mining

The secure scalar product (or dot product) is one of the most widely used sub-protocols in privacy-preserving data mining, and probably the most common. Much attention has therefore been devoted to secure protocols for computing it. However, these protocols share an inherent problem: extremely high computation cost, especially when the dot product is taken over large vectors. Large vectors are common in vertically partitioned data, so the cost is a real obstacle in practice. In this paper, we present ways to efficiently compute an approximate dot product. We implement the dot product protocol and demonstrate the quality of the approximation. Our protocol can be used to securely and efficiently compute association rules from data vertically partitioned between two parties.
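To illustrate why the dot product matters for association rules on vertically partitioned data, here is a minimal plaintext sketch with no cryptography. It assumes each party holds a 0/1 indicator vector over the same transactions, so the support count of the joint itemset equals the dot product of the two vectors. The function names and the uniform-sampling estimator are hypothetical illustrations, not the approximation protocol presented in the paper.

```python
import random


def support_count(x, y):
    """Exact support count: number of transactions where both indicators are 1."""
    return sum(a * b for a, b in zip(x, y))


def sampled_support_estimate(x, y, sample_size, seed=0):
    """Illustrative estimate (an assumption, not the paper's method):
    compute the dot product on a uniform sample of positions and scale up."""
    n = len(x)
    rng = random.Random(seed)
    positions = rng.sample(range(n), sample_size)  # sample without replacement
    partial = sum(x[i] * y[i] for i in positions)
    return partial * n / sample_size


# Party A knows which transactions contain itemset A; party B likewise for B.
alice_has_A = [1, 1, 0, 1, 0, 1, 1, 0]
bob_has_B = [1, 0, 0, 1, 1, 1, 0, 0]

exact = support_count(alice_has_A, bob_has_B)
estimate = sampled_support_estimate(alice_has_A, bob_has_B, sample_size=4)
print(exact, estimate)
```

In a secure setting, neither party would see the other's vector; the sketch only shows the quantity that a secure dot product protocol would compute, and why touching fewer positions reduces the work at the price of an approximate answer.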
