Safely Delegating Data Mining Tasks

Data mining is playing an important role in decision making for business activities and governmental administration. Since many organizations or their divisions do not possess the in-house expertise and infrastructure for data mining, it is beneficial to delegate data mining tasks to external service providers. However, the organizations or divisions may lose of private information during the delegating process. In this paper, we present a Bloom filter based solution to enable organizations or their divisions to delegate the tasks of mining association rules while protecting data privacy. Our approach can achieve high precision in data mining by only trading-off storage requirements, instead of by trading-off the level of privacy preserving.

[1]  Yen-Liang Chen,et al.  Market basket analysis in a multiple store environment , 2005, Decis. Support Syst..

[2]  Gu Si-yang,et al.  Privacy preserving association rule mining in vertically partitioned data , 2006 .

[3]  Chris Clifton,et al.  Using unknowns to prevent discovery of association rules , 2001, SGMD.

[4]  Vassilios S. Verykios,et al.  Disclosure limitation of sensitive rules , 1999, Proceedings 1999 Workshop on Knowledge and Data Engineering Exchange (KDEX'99) (Cat. No.PR00453).

[5]  Ramakrishnan Srikant,et al.  Order preserving encryption for numeric data , 2004, SIGMOD '04.

[6]  Alexandre V. Evfimievski,et al.  Privacy preserving mining of association rules , 2002, Inf. Syst..

[7]  Benny Pinkas,et al.  Cryptographic techniques for privacy-preserving data mining , 2002, SKDD.

[8]  Jayant R. Haritsa,et al.  Maintaining Data Privacy in Association Rule Mining , 2002, VLDB.

[9]  Padhraic Smyth,et al.  Business applications of data mining , 2002, CACM.

[10]  Ramakrishnan Srikant,et al.  Privacy-preserving data mining , 2000, SIGMOD '00.

[11]  Burton H. Bloom,et al.  Space/time trade-offs in hash coding with allowable errors , 1970, CACM.

[12]  Chris Clifton,et al.  When do data mining results violate privacy? , 2004, KDD.

[13]  Balaji Padmanabhan,et al.  On the Use of Optimization for Data Mining: Theoretical Interactions and eCRM Opportunities , 2003, Manag. Sci..

[14]  Rakesh Agrawal,et al.  Privacy-preserving data mining , 2000, SIGMOD 2000.

[15]  Alexandre V. Evfimievski,et al.  Privacy preserving mining of association rules , 2002, Inf. Syst..

[16]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[17]  Jiah-Shing Chen,et al.  Mining inter-organizational retailing knowledge for an alliance formed by competitive firms , 2003, Inf. Manag..

[18]  Chris Clifton,et al.  Privacy-preserving distributed mining of association rules on horizontally partitioned data , 2004, IEEE Transactions on Knowledge and Data Engineering.

[19]  Ling Qiu,et al.  Preserving privacy in association rule mining with bloom filters , 2006, Journal of Intelligent Information Systems.

[20]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[21]  Osmar R. Zaïane,et al.  Protecting sensitive knowledge by data sanitization , 2003, Third IEEE International Conference on Data Mining.

[22]  Qi Wang,et al.  On the privacy preserving properties of random data perturbation techniques , 2003, Third IEEE International Conference on Data Mining.