Privacy preserving mining of association rules

We present a framework for mining association rules from transactions consisting of categorical items, where the data has been randomized to preserve the privacy of individual transactions. While it is feasible to recover association rules and preserve privacy using a straightforward "uniform" randomization, the discovered rules can unfortunately be exploited to find privacy breaches. We analyze the nature of privacy breaches and propose a class of randomization operators that are much more effective than uniform randomization in limiting the breaches. We derive formulae for an unbiased support estimator and its variance, which allow us to recover itemset supports from randomized datasets, and show how to incorporate these formulae into mining algorithms. Finally, we present experimental results that validate the algorithm by applying it to real datasets.
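To make the idea of recovering supports from randomized data concrete, here is a minimal sketch for the single-item case. It assumes a simple per-item randomization in which each item present in a transaction is kept with probability `p_keep` and each absent item is inserted with probability `p_insert`. This scheme, along with the function names and parameters, is a hypothetical illustration and not the class of operators proposed in the paper. Under that assumption the expected observed support is s' = s * p_keep + (1 - s) * p_insert, and inverting this relation gives an unbiased estimate of the true support s.

```python
import random


def randomize(transaction, universe, p_keep, p_insert, rng):
    """Randomize one transaction item by item (assumed illustrative scheme).

    Items in the transaction survive with probability p_keep;
    items not in it are inserted with probability p_insert.
    """
    out = set()
    for item in universe:
        if item in transaction:
            if rng.random() < p_keep:
                out.add(item)
        elif rng.random() < p_insert:
            out.add(item)
    return out


def estimate_support(randomized, item, p_keep, p_insert):
    """Unbiased estimate of an item's true support from randomized data.

    Inverts E[observed] = s * p_keep + (1 - s) * p_insert.
    Requires p_keep != p_insert.
    """
    observed = sum(item in t for t in randomized) / len(randomized)
    return (observed - p_insert) / (p_keep - p_insert)


def estimator_variance(true_support, n, p_keep, p_insert):
    """Variance of the estimate for n transactions with a fixed true support."""
    per_row = (true_support * p_keep * (1 - p_keep)
               + (1 - true_support) * p_insert * (1 - p_insert))
    return per_row / (n * (p_keep - p_insert) ** 2)
```

The variance grows as `p_keep` and `p_insert` approach each other, which reflects the trade-off the abstract describes: stronger randomization protects individual transactions better but makes the recovered supports noisier.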
