Paper Information - Multiple Imputation and Disclosure Protection: The Case of the 1995 Survey of Consumer Finances

Multiple Imputation and Disclosure Protection: The Case of the 1995 Survey of Consumer Finances

Donald Rubin has suggested many times that one might multiply impute all the data in a survey as a means of avoiding disclosure problems in public-use datasets. Disclosure protection in the Survey of Consumer Finances (SCF) is a key issue driven by two forces. First, there are legal requirements stemming from the use of tax data in the sample design. Second, there is an ethical responsibility to protect the privacy of respondents, particularly those with small weights and highly salient characteristics. In the past, a large part of the disclosure review of the survey required tedious and detailed examination of the data. After this review, a limited number of sensitive data values were targeted for a type of constrained imputation, and other undisclosed techniques were applied. This paper looks at the results of an experimental multiple imputation of a large fraction of the SCF data using software specifically designed for the survey. In this exercise, a type of range constraint is used to limit the deviations of the imputations from the reported data. The paper discusses the design of the imputations and provides a preliminary review of the effects of imputation on subsequent analysis.
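As a rough illustration of the range-constrained idea mentioned in the abstract, the sketch below draws several imputed copies ("implicates") of each reported value while keeping every draw within a fixed relative band around the reported figure. It is not the SCF imputation software: the normal draw centred on the reported value, the rejection step, and the names `constrained_draw`, `multiply_impute`, `rel_range`, and `sigma` are all assumptions made for this example, and the sample values are invented.

```python
import random
import statistics


def constrained_draw(reported, rel_range, sigma, rng, max_tries=1000):
    """Draw one imputed value within +/- rel_range of the reported value.

    Hypothetical model (an assumption, not the SCF method): a normal
    distribution centred on the reported value with standard deviation
    sigma * |reported|. Draws outside the allowed range are rejected.
    """
    lo = reported - rel_range * abs(reported)
    hi = reported + rel_range * abs(reported)
    draw = reported
    for _ in range(max_tries):
        draw = rng.gauss(reported, sigma * abs(reported))
        if lo <= draw <= hi:
            return draw
    # Fallback if every attempt was rejected: clamp the last draw to the range.
    return min(max(draw, lo), hi)


def multiply_impute(values, m=5, rel_range=0.1, sigma=0.2, seed=0):
    """Return m implicates, each a list of range-constrained imputations."""
    rng = random.Random(seed)
    return [
        [constrained_draw(v, rel_range, sigma, rng) for v in values]
        for _ in range(m)
    ]


# Illustrative reported values only; they are not SCF data.
reported_values = [52000.0, 1250000.0, 0.0]
implicates = multiply_impute(reported_values, m=5, rel_range=0.1)

# Point estimate of the mean: average the per-implicate means.
implicate_means = [statistics.mean(imp) for imp in implicates]
combined_mean = statistics.mean(implicate_means)
```

Narrowing `rel_range` keeps the imputations closer to the reported data, which preserves more analytic detail but offers less masking; widening it does the opposite.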

A. Kennickell

[1] David E. Booth, et al. Analysis of Incomplete Multivariate Data, 2000, Technometrics.

[2] A. Kennickell, et al. Consistent Weight Design for the 1989, 1992 and 1995 SCFs, and the Distribution of Wealth, 1999.

[3] Arthur B. Kennickell, et al. Imputation of the 1989 Survey of Consumer Finances: Stochastic Relaxation and Multiple Imputation, 1997.

[4] G. Fries. Disclosure Review and Its Implications for the 1992 Survey of Consumer Finances, 1997.

[5] Stephen E. Fienberg, et al. Statistical Notions of Data Disclosure Avoidance and Their Relationship to Traditional Statistical Methodology: Data Swapping and Loglinear Models, 1996.

[6] Donald Geman, et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.