Microaggregation is a statistical disclosure control technique for protecting microdata (i.e., individual records), which are important products of statistical offices. The basic idea of microaggregation is to cluster the individual records in microdata into a number of mutually exclusive groups prior to publication, and then to publish the average over each group instead of the individual records. Previous methods constrain the clustering to a fixed or variable group size in order to reduce information loss. However, the security aspect of microaggregation has not been extensively studied. We argue that a group size requirement alone is not enough to protect the privacy of microdata. We propose a new microaggregation method, which we call secure-k-Ward, to enhance individual privacy. Our method, which is optimization based, minimizes information loss and overall mean deviation while guaranteeing that the security requirement for protecting the microdata is satisfied.