Paper Information - Matrix Masking

Matrix Masking

Matrix masking transforms an n × p data matrix Z (n cases, p variables) into a masked matrix of the general form Z → AZB + C [1], where A is a matrix that operates on the n cases, B is a matrix that operates on the p variables, and C is a matrix that adds perturbations or noise. Matrix masking includes a wide variety of standard approaches to statistical disclosure limitation (SDL): (i) adding noise, i.e., the C in the matrix masking transformation of equation [1]; (ii) releasing a subset of observations (delete rows from Z), i.e., sampling; (iii) cell suppression for cross-classifications; (iv) including simulated data (add rows to Z); (v) releasing a subset of variables (delete columns from Z); (vi) switching selected column values for pairs of rows (data swapping). Even after a mask has been applied to a data set, both identity disclosure and attribute disclosure remain possible, although the risks may be substantially diminished. The entry on Statistical Disclosure Limitation For Data Access focuses on four different matrix masking methods: (i) sampling; (ii) recodings (e.g., collapsing rows or columns, sometimes referred to as global recoding); (iii) perturbation (including adding noise); and (iv) the use of synthetic data.
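The roles of A, B, and C can be illustrated with a small sketch. It assumes the mask has the form AZB + C, with A acting on rows (cases), B acting on columns (variables), and C added elementwise. The helper names (`matmul`, `mask`, and so on), the toy data, and the noise scale are all hypothetical choices for illustration, not part of the original entry. Plain Python lists are used to keep the example dependency-free.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matadd(X, Y):
    """Add two matrices of the same shape elementwise."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(k):
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def mask(Z, A=None, B=None, C=None):
    """Return A Z B + C; omitted A/B default to identity, omitted C to zero."""
    n, p = len(Z), len(Z[0])
    A = identity(n) if A is None else A
    B = identity(p) if B is None else B
    AZB = matmul(matmul(A, Z), B)
    C = zeros(len(AZB), len(AZB[0])) if C is None else C
    return matadd(AZB, C)

# Toy data matrix: n = 4 cases, p = 3 variables (values are made up).
Z = [[1.0, 10.0, 100.0],
     [2.0, 20.0, 200.0],
     [3.0, 30.0, 300.0],
     [4.0, 40.0, 400.0]]

# (ii) Sampling: A is a row-selection matrix that keeps cases 0 and 2.
A_sample = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0]]
sampled = mask(Z, A=A_sample)

# (v) Releasing a subset of variables: B keeps columns 0 and 2.
B_subset = [[1.0, 0.0],
            [0.0, 0.0],
            [0.0, 1.0]]
subset = mask(Z, B=B_subset)

# (i) Adding noise: C holds independent Gaussian perturbations
# (standard deviation 0.5 is an arbitrary illustrative choice).
rng = random.Random(0)
C_noise = [[rng.gauss(0.0, 0.5) for _ in range(3)] for _ in range(4)]
noisy = mask(Z, C=C_noise)
```

In this sketch, deleting rows or columns corresponds to non-square selection matrices A or B, while additive noise leaves A and B as identities. Methods such as data swapping generally need a more elaborate choice of A, B, and C and are not shown here.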

Stephen E. Fienberg | Jiashun Jin

[1] George T. Duncan, et al. Enhancing Access to Microdata while Protecting Confidentiality: Prospects for the Future, 1991.