Imputation for hierarchical datasets and responses in intervals

Obtaining reliable income information in surveys is difficult for two reasons. First, many survey respondents consider income to be sensitive information and are therefore reluctant to answer questions about it. If survey participants who do not report their income differ systematically from those who do (and there is ample research indicating that they do), results based only on the observed income values will be misleading. Second, respondents tend to round their income. This second source of error in particular is usually ignored when income information is analyzed. In a recent paper, Drechsler and Kiesl (2014) illustrated that inferences based on the collected information can be biased if the rounding is ignored, and they suggested a multiple imputation strategy to account for rounding in reported income. In this paper we extend their approach to also address the nonresponse problem. We illustrate the approach using the household income variable from the German panel study “Labor Market and Social Security”.
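To make the rounding problem concrete, the following minimal sketch simulates heaped income reports. It is not the model used in the paper: the lognormal parameters, the rounding bases (1000, 500, 100), the rounding probabilities, and all function names are hypothetical choices made only for illustration.

```python
import random

def simulate_incomes(n, seed=0):
    """Draw n hypothetical 'true' monthly incomes from a lognormal distribution."""
    rng = random.Random(seed)
    return [round(rng.lognormvariate(7.3, 0.6)) for _ in range(n)]

def heap(incomes, probs=((1000, 0.2), (500, 0.2), (100, 0.3)), seed=1):
    """Round each income to one of several bases with assumed probabilities.

    With the remaining probability (here 0.3) the exact value is reported.
    """
    rng = random.Random(seed)
    reported = []
    for y in incomes:
        u = rng.random()
        cum = 0.0
        base = 1  # base 1 means "report the exact value"
        for b, p in probs:
            cum += p
            if u < cum:
                base = b
                break
        if base > 1:
            # Round to the nearest multiple, but never report zero income.
            reported.append(max(base, base * round(y / base)))
        else:
            reported.append(y)
    return reported

def share_below(values, threshold):
    """Share of values strictly below a threshold (e.g. a poverty line)."""
    return sum(v < threshold for v in values) / len(values)

def share_multiple_of(values, base):
    """Share of values that are exact multiples of base (a simple heaping check)."""
    return sum(v % base == 0 for v in values) / len(values)

true_income = simulate_incomes(20000)
reported_income = heap(true_income)

print("multiples of 100, true vs reported:",
      share_multiple_of(true_income, 100),
      share_multiple_of(reported_income, 100))
print("share below 1000, true vs reported:",
      share_below(true_income, 1000),
      share_below(reported_income, 1000))
```

Comparing the two printed pairs shows how heaping concentrates reports on round values and can shift threshold-based statistics, which is the kind of distortion an imputation strategy for rounded data aims to correct.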

[1] John Van Hoewyk, et al. A multivariate technique for multiply imputing missing values using a sequence of regression models, 2001.

[2] Donald B. Rubin, et al. Inference from Coarse Data via Multiple Imputation with Application to Age Heaping, 1990.

[3] D. Rubin, et al. Statistical Analysis with Missing Data, 1988.

[4] David B. Dunson, et al. Bayesian Data Analysis, 2010.

[5] Sandra D. Griffith, et al. Truth and Memory: Linking Instantaneous and Retrospective Self-Reported Cigarette Consumption, 2013, The Annals of Applied Statistics.

[6] Joerg Drechsler, et al. Beat the heap - an imputation strategy for valid inferences from rounded income data, 2016.

[7] Mauro Gallegati, et al. Pareto's Law of Income Distribution: Evidence for Germany, the United Kingdom, and the United States, 2005.

[8] Monique Graf, et al. Modeling of Income and Indicators of Poverty and Social Exclusion Using the Generalized Beta Distribution of the Second Kind, 2013.

[9] D. Rubin. Inference and Missing Data, 1975.

[10] P. Ruud, et al. Uncertainty causes rounding: an experimental study, 2014.

[11] Donald B. Rubin, et al. Multiple imputations in sample surveys, 1978.

[12] Mark Trappmann, et al. PASS – A Household Panel Survey for Research on Unemployment and Poverty, 2010.

[13] Roger A. Sugden, et al. Multiple Imputation for Nonresponse in Surveys, 1988.

[14] A. Gelman, et al. On the Stationary Distribution of Iterative Imputations, 2010, arXiv:1012.2902.

[15] Daniel F. Heitjan, et al. Ignorability in general incomplete-data models, 1994.

[16] L. Hedges, et al. Reports of elapsed time: bounding and rounding processes in estimation, 1990, Journal of Experimental Psychology: Learning, Memory, and Cognition.

[17] Philip Heidelberger, et al. Simulation Run Length Control in the Presence of an Initial Transient, 1983, Oper. Res.

[18] Hao Wang, et al. Modeling heaping in self‐reported cigarette counts, 2008, Statistics in Medicine.

[19] Francesca Molinari, et al. Rounding Probabilistic Expectations in Surveys, 2010, Journal of Business & Economic Statistics.

[20] Jorn-Steffen Pischke, et al. Measurement Error and Earnings Dynamics: Some Estimates from the PSID Validation Study, 1995.