Controlled rounding and cell perturbation: statistical disclosure limitation methods for tabular data

Rounding methods are widely used by statistical offices to protect sensitive information when publishing data in tabular form. Classical versions of these methods do not take protection levels into account while searching for patterns with minimum information loss, so a so-called auditing phase is typically required to check the protection that the proposed patterns provide. This paper presents a mathematical model for the whole problem of finding a protected pattern with minimum loss of information, and proposes a branch-and-cut algorithm to solve it. It also describes a new methodology that is closely related to the classical Controlled Rounding methods but has several advantages. The new methodology, named Cell Perturbation, leads to a different optimization problem that is simpler to solve than the previous one. The paper presents a cutting-plane algorithm that finds an exact solution of the new problem, that is, a pattern that guarantees the same protection level requirements with a smaller loss of information than the optimal patterns of classical Controlled Rounding. The auditing phase is unnecessary for the solutions generated by the two algorithms. The paper concludes with computational results on real-world instances and discusses a modification of the objective function that guarantees statistical properties in the solutions.
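To make the notion of a rounding pattern concrete, the following minimal sketch checks the two structural conditions usually associated with classical controlled rounding: every published cell is a multiple of the base adjacent to the original value, and the rounded table stays additive. It is only an illustration, not the paper's model or algorithm. The function names, the 3x3 example table, and the base of 5 are assumptions made for this example, and the protection level requirements discussed above are deliberately not modelled.

```python
from typing import List

Table = List[List[int]]  # last column holds row totals, last row holds column totals


def round_to_nearest(table: Table, base: int) -> Table:
    """Conventional rounding: each cell independently goes to the nearest multiple of base."""
    return [[((v + base // 2) // base) * base for v in row] for row in table]


def is_controlled_rounding(original: Table, rounded: Table, base: int) -> bool:
    """Return True if `rounded` is an additive, adjacent-multiple rounding of `original`.

    Hypothetical helper for illustration; protection levels are NOT checked here.
    """
    # Condition 1: each cell is rounded to one of the two adjacent multiples of base
    # (cells that are already multiples of base must stay unchanged).
    for row_o, row_r in zip(original, rounded):
        for o, r in zip(row_o, row_r):
            lower = (o // base) * base
            upper = lower if o % base == 0 else lower + base
            if r not in (lower, upper):
                return False
    # Condition 2: the rounded interior cells still add up to the rounded marginals.
    for row in rounded:
        if sum(row[:-1]) != row[-1]:
            return False
    for j in range(len(rounded[0])):
        column = [row[j] for row in rounded]
        if sum(column[:-1]) != column[-1]:
            return False
    return True


# Assumed example: a 2x2 table with marginals, rounded to base 5.
original = [
    [3, 8, 11],
    [6, 4, 10],
    [9, 12, 21],
]
candidate = [
    [5, 5, 10],
    [5, 5, 10],
    [10, 10, 20],
]
naive = round_to_nearest(original, 5)

print(is_controlled_rounding(original, candidate, 5))
print(is_controlled_rounding(original, naive, 5))
```

In this example the candidate pattern satisfies both conditions, while independent nearest-multiple rounding breaks additivity in the first row (5 + 10 against a rounded total of 10). A check of this kind says nothing about how well the sensitive cells are protected, which is the gap the auditing phase fills in the classical approach.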
