Nonparametric imputation method for nonresponse in surveys

Many imputation methods are based on a statistical model that assumes the variable of interest is a noisy observation of a function of the auxiliary variables or covariates. Misspecifying this function can lead to severe estimation errors and misleading conclusions. Imputation techniques can therefore benefit from flexible formulations that capture a wide range of patterns. We consider the use of smoothing splines within an additive model framework to estimate the functional dependence between the variable of interest and the auxiliary variables. The resulting estimator allows us to build an imputation model that accommodates multiple auxiliary variables. The performance of our method is assessed via numerical experiments involving simulated and real data.
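The idea of imputing from an additive fit can be illustrated with a minimal sketch; it is not the authors' implementation. It makes several assumptions that do not come from the abstract: a Gaussian kernel smoother stands in for the smoothing spline so that the example needs only the standard library, the additive model is fitted by plain backfitting, all units carry equal weight, imputation is deterministic, and the data are simulated. Every function name, the bandwidth `h`, and the response mechanism are hypothetical.

```python
import math
import random

def kernel_smooth(x_train, r_train, x_eval, h):
    """Nadaraya-Watson smoother with a Gaussian kernel of bandwidth h.
    Assumption: used here as a simple stand-in for a smoothing spline."""
    fitted = []
    for x0 in x_eval:
        w = [math.exp(-0.5 * ((x0 - xi) / h) ** 2) for xi in x_train]
        fitted.append(sum(wi * ri for wi, ri in zip(w, r_train)) / sum(w))
    return fitted

def fit_additive(X, y, h=0.05, n_sweeps=10):
    """Backfit y ~ alpha + f_1(x_1) + ... + f_p(x_p) on the respondents."""
    n, p = len(y), len(X[0])
    alpha = sum(y) / n
    cols = [[row[j] for row in X] for j in range(p)]
    f = [[0.0] * n for _ in range(p)]   # f[j][i] = f_j evaluated at unit i
    parts = [None] * p                  # partial residuals and centring constants
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial residuals: remove the intercept and every other component.
            r = [y[i] - alpha - sum(f[k][i] for k in range(p) if k != j)
                 for i in range(n)]
            fj = kernel_smooth(cols[j], r, cols[j], h)
            centre = sum(fj) / n        # centre f_j so alpha stays identifiable
            f[j] = [v - centre for v in fj]
            parts[j] = (r, centre)      # state of component j as of the last sweep
    return {"alpha": alpha, "cols": cols, "parts": parts, "h": h}

def predict_additive(model, X_new):
    """Evaluate the fitted additive predictor at new covariate rows."""
    preds = [model["alpha"]] * len(X_new)
    for j, (r, centre) in enumerate(model["parts"]):
        xj_new = [row[j] for row in X_new]
        fj_new = kernel_smooth(model["cols"][j], r, xj_new, model["h"])
        preds = [pr + v - centre for pr, v in zip(preds, fj_new)]
    return preds

# Simulated sample (hypothetical): two auxiliary variables, additive signal.
random.seed(1)
n = 200
X = [(random.random(), random.random()) for _ in range(n)]
y = [2.0 + math.sin(2 * math.pi * x1) + 4.0 * (x2 - 0.5) ** 2
     + random.gauss(0.0, 0.1) for x1, x2 in X]

# Nonresponse depends only on the observed x1 (an assumed mechanism).
responds = [random.random() < 0.9 - 0.4 * x1 for x1, _ in X]
resp = [i for i in range(n) if responds[i]]
nonresp = [i for i in range(n) if not responds[i]]

# Fit on respondents, then impute the missing y values from the fitted model.
model = fit_additive([X[i] for i in resp], [y[i] for i in resp])
imputed = predict_additive(model, [X[i] for i in nonresp])

# Imputed estimator of the mean versus the (normally unavailable) full-sample mean.
imputed_mean = (sum(y[i] for i in resp) + sum(imputed)) / n
full_mean = sum(y) / n
mae = sum(abs(v - y[i]) for v, i in zip(imputed, nonresp)) / len(nonresp)
print(imputed_mean, full_mean, mae)
```

In a real survey the sums would use design weights, and the smoother and its smoothing parameter would be chosen by a data-driven criterion rather than fixed by hand as in this sketch.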
