A Hybrid Method for k-Anonymization

K-anonymity is a model for protecting publicly released microdata from individual identification. It requires that each record in the anonymized dataset be identical to at least k-1 other records with respect to a set of privacy-related attributes. Although it is easy to anonymize a dataset so that it satisfies k-anonymity, the anonymized dataset should also preserve as much of the original information as possible. To minimize the information loss caused by anonymization, it is crucial to group similar records together and then anonymize each group individually. This work compares the performance of two recently proposed clustering-based techniques for k-anonymization and proposes a hybrid of the two that incurs less information loss than either original technique. Experimental results show that the proposed hybrid technique reduces not only the total information loss but also the variance of information loss among groups.
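To make the k-anonymity requirement stated above concrete, the following minimal Python sketch checks whether a table satisfies it. The function name `is_k_anonymous`, the attribute names, and the toy records are illustrative assumptions, not taken from this work, and the sketch does not implement either clustering technique or the proposed hybrid.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of values on the given
    privacy-related attributes occurs in at least k records."""
    # Count how many records share each combination of attribute values.
    counts = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    # Each record must have at least k-1 identical companions.
    return all(count >= k for count in counts.values())


# Hypothetical toy table, already generalized into two groups of two records.
anonymized = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cold"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "asthma"},
]

print(is_k_anonymous(anonymized, ["age", "zip"], 2))
print(is_k_anonymous(anonymized, ["age", "zip"], 3))
```

In this toy table each combination of `age` and `zip` appears exactly twice, so the check succeeds for k = 2 and fails for k = 3. Coarser generalization would satisfy larger k, but at the cost of greater information loss, which is the trade-off the clustering-based techniques aim to manage.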