Towards Balancing Data Usefulness and Privacy Protection in K-Anonymisation

K-anonymisation, as an approach to protecting data privacy, has received much recent attention from the database research community. Given a single table, there can be many ways to anonymise it, so criteria for determining a preferred solution are important. Various techniques have been proposed, each attempting to achieve some form of optimality in k-anonymisation, but few have considered the balance between the usefulness of the anonymised data and the protection of the original data. In this paper, we address this issue and propose a two-step approach that allows data usefulness and privacy protection requirements to be considered and balanced in k-anonymisation.
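
To make concrete the point that a single table can be anonymised in many ways, the following is a minimal sketch in Python. It is not the two-step approach proposed in the paper. The toy table, the choice of quasi-identifiers, and the helper names (`is_k_anonymous`, `generalise_age`, `mask_zip`) are all assumptions made for illustration. Both generalisations below satisfy 2-anonymity, but they retain different amounts of detail, which is why a criterion for preferring one over the other is needed.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

def generalise_age(row, width):
    """Hypothetical helper: replace an exact age with a band of the given width."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}

def mask_zip(row, keep):
    """Hypothetical helper: keep the first `keep` characters of the zip code."""
    z = row["zip"]
    return {**row, "zip": z[:keep] + "*" * (len(z) - keep)}

# Toy data, invented for illustration only.
table = [
    {"age": 23, "zip": "47677", "disease": "flu"},
    {"age": 27, "zip": "47602", "disease": "cold"},
    {"age": 35, "zip": "47678", "disease": "asthma"},
    {"age": 38, "zip": "47605", "disease": "flu"},
]
qi = ["age", "zip"]

# Option A: coarse age bands, partially preserved zip codes.
option_a = [mask_zip(generalise_age(r, 20), 3) for r in table]

# Option B: finer age bands, fully suppressed zip codes.
option_b = [mask_zip(generalise_age(r, 10), 0) for r in table]

print(is_k_anonymous(table, qi, 2))     # original table
print(is_k_anonymous(option_a, qi, 2))  # one way to anonymise
print(is_k_anonymous(option_b, qi, 2))  # a different way to anonymise
```

Option A keeps part of the zip code but merges all ages into one band, while Option B keeps finer age bands at the cost of suppressing the zip code entirely. Which is preferable depends on how the anonymised data will be used and how much protection is required.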
