Association Rule Mining on Fragmented Database

Anonymization methods are an important tool for protecting privacy. The goal is to release data while preventing individuals from being identified. Most approaches generalize the data, reducing the level of detail so that many individuals appear the same. An alternative class of methods, including anatomy, fragmentation, and slicing, preserves detail by generalizing only the link between identifying and sensitive data. We investigate learning association rules on such a database. Association rule mining on a generalized database is challenging because specific values are replaced with generalizations, which eliminates interesting fine-grained correlations. We instead learn association rules from a fragmented database, which preserves fine-grained values. Only rules involving both identifying and sensitive information are affected; we demonstrate the efficacy of learning in such an environment.
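To make the distinction between affected and unaffected rules concrete, the following is a minimal sketch, assuming an anatomy-style fragmentation in which identifying and sensitive values are released in two tables that share only a group id. The toy records and the names `fragment`, `exact_support`, and `expected_cross_support` are hypothetical, and the uniform-pairing estimate for cross-table itemsets is an illustrative assumption, not necessarily the estimator used in this work.

```python
from collections import Counter

# Hypothetical toy records: (age_band, zip_prefix, disease).
RECORDS = [
    ("30-39", "479", "flu"),
    ("30-39", "479", "cancer"),
    ("40-49", "478", "flu"),
    ("30-39", "478", "flu"),
]

def fragment(records, group_size):
    """Split records into an identifying table and a sensitive table.

    The two tables share only a group id, so the exact row-level link
    between identifying and sensitive values is lost within each group.
    """
    qi_table = [(i // group_size, rec[:-1]) for i, rec in enumerate(records)]
    # Sorting drops the original row order inside each group.
    s_table = sorted((i // group_size, rec[-1]) for i, rec in enumerate(records))
    return qi_table, s_table

def exact_support(qi_table, qi_pred):
    """Support of an identifying-only itemset: unaffected by fragmentation."""
    return sum(1 for _, qi in qi_table if qi_pred(qi)) / len(qi_table)

def expected_cross_support(qi_table, s_table, qi_pred, s_value):
    """Estimated support of an itemset spanning both tables.

    Assumption: every pairing of identifying and sensitive rows inside a
    group is equally likely, so a group of size n with a matching
    identifying rows and b matching sensitive rows contributes a * b / n.
    """
    size, qi_hits, s_hits = Counter(), Counter(), Counter()
    for gid, qi in qi_table:
        size[gid] += 1
        qi_hits[gid] += qi_pred(qi)
    for gid, s in s_table:
        s_hits[gid] += (s == s_value)
    expected = sum(qi_hits[g] * s_hits[g] / size[g] for g in size)
    return expected / len(qi_table)

# Rule under study (hypothetical): age_band = "30-39" -> disease = "flu".
qi_table, s_table = fragment(RECORDS, group_size=2)
is_30s = lambda qi: qi[0] == "30-39"
support_lhs = exact_support(qi_table, is_30s)            # counted exactly
support_rule = expected_cross_support(qi_table, s_table, is_30s, "flu")  # estimated
confidence = support_rule / support_lhs
print(support_lhs, support_rule, confidence)
```

In this sketch, itemsets drawn only from identifying attributes (or only from sensitive attributes) are counted exactly from a single table, while itemsets that span both tables can only be estimated from per-group counts.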
