An Efficient and Dynamic Concept Hierarchy Generation for Data Anonymization

Protecting sensitive, individual-specific information has become an area of concern over the past decade. Several techniques, such as k-anonymity and l-diversity, that employ generalization and suppression based on concept hierarchies (CHTs) have been proposed in the literature. The effectiveness of anonymization depends on which CHT is chosen from the many CHTs possible for a given attribute. This paper proposes a model for constructing a dynamic CHT for numerical attributes that can be: 1) generated on the fly for both generalization and suppression; 2) dynamically adjusted based on a given k. Data anonymized using our method yielded 12% better utility than data anonymized using existing methods. Experimental results support these claims and are discussed in the paper.
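To make the idea of a k-dependent hierarchy for a numerical attribute concrete, the following is a minimal sketch and not the construction proposed in the paper. It assumes a simple greedy strategy: sorted values are grouped into leaf intervals that each cover at least k records, and adjacent intervals are then merged pairwise until a single root interval remains. The function names `leaf_intervals` and `build_hierarchy` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def leaf_intervals(values: List[float], k: int) -> List[Interval]:
    """Greedily group sorted values into intervals covering at least k records each.

    Illustrative assumption: this is NOT the paper's algorithm, only a simple
    way to obtain k-dependent leaf intervals.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ordered = sorted(values)
    if len(ordered) < k:
        raise ValueError("need at least k values")

    groups: List[List[float]] = []
    current: List[float] = []
    for v in ordered:
        # Close the current group once it holds k records, but never split
        # equal values across two intervals.
        if len(current) >= k and v != current[-1]:
            groups.append(current)
            current = []
        current.append(v)

    # A trailing group with fewer than k records is folded into its neighbour.
    if len(current) < k and groups:
        groups[-1].extend(current)
    else:
        groups.append(current)

    return [(g[0], g[-1]) for g in groups]


def build_hierarchy(values: List[float], k: int) -> List[List[Interval]]:
    """Return hierarchy levels, from the leaf intervals up to a single root interval."""
    level = leaf_intervals(values, k)
    levels = [level]
    while len(level) > 1:
        merged: List[Interval] = []
        # Merge adjacent intervals pairwise; an odd interval is carried up as-is.
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            merged.append((pair[0][0], pair[-1][1]))
        level = merged
        levels.append(level)
    return levels


if __name__ == "__main__":
    ages = [21, 22, 25, 27, 30, 31, 35, 40, 41, 45]
    for k in (3, 5):
        print(k, build_hierarchy(ages, k))
```

Changing k changes the leaf intervals and therefore the whole hierarchy, which is the sense in which such a hierarchy is adjusted dynamically rather than fixed in advance. A real implementation would also need a utility measure to choose among candidate hierarchies, which this sketch omits.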
