Systematic clustering method for l-diversity model

Privacy has become a major concern, and many research efforts have been dedicated to developing privacy-protecting technology. Anonymization techniques provide an efficient approach to protecting data privacy. We recently proposed a systematic clustering method based on the k-anonymization technique that minimizes information loss while assuring data quality. In this paper, we extend that work to the l-diversity model, which requires that every group of indistinguishable records contain at least l distinct sensitive attribute values. The proposed technique groups similar records together so that each group has l-diverse sensitive values, and then anonymizes each group individually. The structure of the systematic clustering problem for the l-diversity model is defined, investigated through a paradigm, and implemented in two steps: a clustering step for k-anonymization and an l-diverse step. Finally, two algorithms for the two steps are developed, and the time complexity of the first step is shown to be in O(n/k2), where n is the total number of records, each describing an individual whose privacy is of concern, and k is the anonymity parameter for k-anonymization.
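The two properties named above can be illustrated with a minimal sketch. It only checks whether an already anonymized table satisfies k-anonymity and distinct l-diversity; it is not the clustering algorithm of the paper. The function names, the attribute names (`age`, `zip`, `disease`), and the toy table are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, quasi_identifiers):
    """Group records that share the same (generalized) quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every group of indistinguishable records has at least k members."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every group holds at least l distinct sensitive values."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return all(
        len({record[sensitive] for record in group}) >= l
        for group in groups.values()
    )


# Hypothetical, already generalized table (illustrative data only).
table = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cancer"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
]

quasi = ["age", "zip"]
k_ok = is_k_anonymous(table, quasi, k=2)
l_ok = is_l_diverse(table, quasi, "disease", l=2)
```

In this toy table both groups contain two records, so the table is 2-anonymous, but the second group holds only one distinct disease, so it is not 2-diverse. This is the gap the l-diverse step is meant to close.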
