Modeling Projections in Microaggregation

Microaggregation is a method used by statistical agencies to limit the disclosure of sensitive microdata. Optimal microaggregation has been proven to be NP-hard when more than one variable is microaggregated at the same time. To approach the multivariate problem heuristically, a few methods based on projections have been introduced in the literature. The main drawback of these methods is that the projection axis is computed by maximizing a statistical property (e.g., the global variance of the data), disregarding the fact that the aim of microaggregation is to keep the disclosure risk as low as possible for all records.
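To make the projection idea concrete, the following is a minimal sketch of a generic projection-based heuristic, not the method proposed in this work. It assumes numeric records, uses the maximum-variance direction (first principal component, approximated by power iteration) as the projection axis, and forms fixed-size groups of at least k records. The function names `first_principal_axis` and `microaggregate_by_projection` are hypothetical and chosen only for illustration.

```python
import math


def first_principal_axis(rows, iters=200):
    """Approximate the maximum-variance direction by power iteration.

    Returns (means, axis). Assumption: the start vector is not orthogonal
    to the dominant eigenvector of the covariance matrix.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # d x d covariance matrix of the centered data
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all records identical: any axis will do
            break
        v = [x / norm for x in w]
    return means, v


def microaggregate_by_projection(rows, k):
    """Replace each record by the centroid of its group of >= k records.

    Records are sorted by their projection on the maximum-variance axis
    and split into consecutive groups of size k; the last group absorbs
    the remainder so that no group has fewer than k records.
    """
    n = len(rows)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of records")
    d = len(rows[0])
    means, axis = first_principal_axis(rows)

    def score(r):
        # Scalar projection of the centered record on the axis
        return sum((r[j] - means[j]) * axis[j] for j in range(d))

    order = sorted(range(n), key=lambda i: score(rows[i]))
    n_groups = n // k
    protected = [None] * n
    for g in range(n_groups):
        start = g * k
        end = (g + 1) * k if g < n_groups - 1 else n
        members = order[start:end]
        centroid = [sum(rows[i][j] for i in members) / len(members)
                    for j in range(d)]
        for i in members:
            protected[i] = centroid
    return protected


if __name__ == "__main__":
    data = [(1, 1), (2, 2), (3, 3), (10, 10), (11, 11), (12, 12)]
    for original, masked in zip(data, microaggregate_by_projection(data, 3)):
        print(original, "->", masked)
```

The axis in this sketch is chosen purely by global variance, which is exactly the criterion criticized above: nothing in the selection step accounts for the disclosure risk of individual records.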
