Disclosure Limitation Methods and Information Loss for Tabular Data

Even in the age of electronic dissemination of statistical data, tables remain central data products of statistical agencies. For prominent examples, see the American FactFinder (http://factfinder.census.gov/servlet/BasicFactsServlet) from the U.S. Census Bureau, the Office for National Statistics (http://www.statistics.gov.uk/) in the U.K., and Statistics Netherlands (http://www.cbs.nl/en/figures/keyfigures/index.htm). Much survey and census data is categorical in nature, so representing survey results as cross-classifications, or tables, is a natural device for statistical reporting. Even when they collect measurement data, statistical agencies often report it in the form of discretized quantities. As a result, tables of counts are a primary unit of reporting and analysis. Sometimes these tables are simple cross-classifications of the counts of survey and census elements. At other times the sample units are weighted according to their probabilities of selection and/or the entries are interpretable as the numbers of people in the population (based on the sample). In such tables of counts, small values are usually taken to signal a possible disclosure risk, since data for individuals who are unique in the population may be used by an intruder or data snooper in matching against other databases.
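To make the small-count concern concrete, the following is a minimal sketch rather than any agency's actual rule. It cross-classifies a handful of made-up records into a table of counts and lists the cells whose counts fall below a cutoff. The records, the function names `cross_classify` and `small_cells`, and the cutoff of 3 are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical microdata: one (region, age_group) pair per respondent.
records = (
    [("North", "18-39")] * 3
    + [("North", "40-64")] * 4
    + [("North", "65+")] * 2
    + [("South", "18-39")] * 5
    + [("South", "40-64")] * 1
    + [("South", "65+")] * 3
)


def cross_classify(rows):
    """Build a table of counts keyed by the cross-classification cell."""
    return Counter(rows)


def small_cells(table, threshold=3):
    """Return cells with a positive count below the (assumed) threshold."""
    return sorted(cell for cell, n in table.items() if 0 < n < threshold)


table = cross_classify(records)
risky = small_cells(table)

for cell in risky:
    print(cell, table[cell])
```

A cell with a count of 1 corresponds to a respondent who is unique on the tabulated variables within the sample, which is the situation the matching argument above is concerned with.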
