Semi-Homogenous Generalization: Improving Homogenous Generalization for Privacy Preservation in Cloud Computing

Data security is one of the leading concerns and primary challenges for cloud computing. This issue is getting more and more serious with the development of cloud computing. However, the existing privacy-preserving data sharing techniques either fail to prevent the leakage of privacy or incur huge amounts of information loss. In this paper, we propose a novel technique, termed as linking-based anonymity model, which achieves K-anonymity with quasi-identifiers groups (QI-groups) having a size less than K. In the meanwhile, a semi-homogenous generalization is introduced to be against the attack incurred by homogenous generalization. To implement linking-based anonymization model, we propose a simple yet efficient heuristic local recoding method. Extensive experiments on real datasets are also conducted to show that the utility has been significantly improved by our approach compared with the state-of-the-art methods.

[1]  D. DeWitt,et al.  K-Anonymization as Spatial Indexing: Toward Scalable and Incremental Anonymization , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[2]  David J. DeWitt,et al.  Workload-aware anonymization , 2006, KDD '06.

[3]  David J. DeWitt,et al.  Incognito: efficient full-domain K-anonymity , 2005, SIGMOD '05.

[4]  Pierangela Samarati,et al.  Protecting Respondents' Identities in Microdata Release , 2001, IEEE Trans. Knowl. Data Eng..

[5]  Panos Kalnis,et al.  Fast Data Anonymization with Low Information Loss , 2007, VLDB.

[6]  Latanya Sweeney,et al.  k-Anonymity: A Model for Protecting Privacy , 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst..

[7]  Philip S. Yu,et al.  Can the Utility of Anonymized Data be Used for Privacy Breaches? , 2009, TKDD.

[8]  Adam D. Smith,et al.  Composition attacks and auxiliary information in data privacy , 2008, KDD.

[9]  Raymond Chi-Wing Wong,et al.  Minimality Attack in Privacy Preserving Data Publishing , 2007, VLDB.

[10]  Philip S. Yu,et al.  Top-down specialization for information and privacy preservation , 2005, 21st International Conference on Data Engineering (ICDE'05).

[11]  David J. DeWitt,et al.  Mondrian Multidimensional K-Anonymity , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[12]  Jian Pei,et al.  Utility-based anonymization using local recoding , 2006, KDD '06.

[13]  Tamir Tassa,et al.  k-Anonymization Revisited , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[14]  Nikos Mamoulis,et al.  Non-homogeneous generalization in privacy preserving data publishing , 2010, SIGMOD Conference.

[15]  Roberto J. Bayardo,et al.  Data privacy through optimal k-anonymization , 2005, 21st International Conference on Data Engineering (ICDE'05).

[16]  Daniel Kifer,et al.  Attacks on privacy and deFinetti's theorem , 2009, SIGMOD Conference.

[17]  Pierangela Samarati,et al.  Generalizing Data to Provide Anonymity when Disclosing Information , 1998, PODS 1998.