Review of research on k-anonymization technique to transform data for privacy preservation

Private information may leak when data are disclosed. Even when identifier attributes are suppressed, individuals can still be re-identified by linking the remaining attributes to external sources. Transforming data with the k-anonymization technique can ensure a chosen level of privacy. This paper reviews research on the k-anonymization technique. First, it presents background knowledge, the re-identification problem, related terminology, and attribute types. It then reviews information-loss estimation methods and transformation algorithms for achieving the k-anonymity property. Finally, it describes possible attacks and suggests directions for future research.
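
As an illustration of the k-anonymity property discussed above, the following minimal Python sketch checks whether a table is k-anonymous with respect to a set of quasi-identifier attributes, before and after a simple generalization step. The table, the attribute names (`zip`, `age`, `disease`), and the helper functions `is_k_anonymous` and `generalize` are hypothetical and chosen only for illustration; they are not taken from the reviewed algorithms.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that appears in the table is shared by at least k records."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


def generalize(record):
    """Hypothetical generalization: keep the first three ZIP digits
    and replace the exact age with a ten-year range."""
    out = dict(record)
    out["zip"] = record["zip"][:3] + "**"
    low = record["age"] // 10 * 10
    out["age"] = f"{low}-{low + 9}"
    return out


# Illustrative table; the explicit identifier (e.g. name) is already suppressed.
raw = [
    {"zip": "13053", "age": 28, "disease": "flu"},
    {"zip": "13068", "age": 21, "disease": "cold"},
    {"zip": "14850", "age": 35, "disease": "asthma"},
    {"zip": "14853", "age": 37, "disease": "flu"},
]
quasi_identifiers = ["zip", "age"]  # assumed linkable to external sources

generalized = [generalize(r) for r in raw]

print(is_k_anonymous(raw, quasi_identifiers, 2))          # False
print(is_k_anonymous(generalized, quasi_identifiers, 2))  # True
```

In the raw table every (zip, age) pair is unique, so each record could be linked to an external source. After generalization each record shares its quasi-identifier values with one other record, so the table is 2-anonymous, at the cost of less precise ZIP and age values.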
