P-Sensitive K-Anonymity with Generalization Constraints

Numerous privacy models that build on or extend k-anonymity have been introduced in data privacy research in the last few years: l-diversity, p-sensitive k-anonymity, (α, k)-anonymity, t-closeness, etc. While they differ in their methods and in the quality of their results, they all focus first on masking the data, and only then on protecting the quality of the data as a whole. We consider a new approach, in which requirements on the amount of distortion allowed on the initial data are imposed in order to preserve its usefulness. Our approach consists of specifying generalization constraints for the quasi-identifiers and achieving p-sensitive k-anonymity within the imposed constraints. We think that limiting the amount of allowed generalization when masking microdata is indispensable for real-life datasets and applications. In this paper, the constrained p-sensitive k-anonymity model is introduced and an algorithm for generating constrained p-sensitive k-anonymous microdata is presented. Our experiments show that the proposed algorithm is comparable, with respect to the quality of the results, to existing algorithms for generating p-sensitive k-anonymous microdata, and that the resulting masked microdata complies with the generalization constraints specified by the user.
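To make the target property concrete, the following minimal sketch checks whether an already masked table is p-sensitive k-anonymous under the usual definition: every group of records sharing the same quasi-identifier values has at least k records and at least p distinct values for each sensitive attribute. It is an illustration only, not the algorithm presented in the paper, and it does not check generalization constraints. The function name, the attribute names, and the sample records are assumptions made for this example.

```python
from collections import defaultdict


def is_p_sensitive_k_anonymous(records, quasi_identifiers, sensitive_attrs, k, p):
    """Return True if the masked records are p-sensitive k-anonymous.

    Hypothetical helper for illustration; records are dicts keyed by
    attribute name, with quasi-identifier values already generalized.
    """
    # Group records by their combination of quasi-identifier values.
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[attr] for attr in quasi_identifiers)
        groups[key].append(rec)

    for group in groups.values():
        # k-anonymity: each group must contain at least k records.
        if len(group) < k:
            return False
        # p-sensitivity: each sensitive attribute must take at least
        # p distinct values within the group.
        for attr in sensitive_attrs:
            if len({rec[attr] for rec in group}) < p:
                return False
    return True


# Made-up masked microdata with generalized age and zip code.
masked = [
    {"age": "20-29", "zip": "410**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "410**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "482**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "482**", "diagnosis": "flu"},
]

qi = ["age", "zip"]
sens = ["diagnosis"]
ok_p1 = is_p_sensitive_k_anonymous(masked, qi, sens, k=2, p=1)
ok_p2 = is_p_sensitive_k_anonymous(masked, qi, sens, k=2, p=2)
```

In this sample, both groups have two records, so the table is 2-anonymous (`ok_p1` is true), but the second group contains a single diagnosis value, so the table is not 2-sensitive (`ok_p2` is false).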
