The R Package hmi: A Convenient Tool for Hierarchical Multiple Imputation and Beyond

"Applications of multiple imputation have long outgrown the traditional context of dealing with item nonresponse in cross-sectional datasets. Nowadays multiple imputation is also applied to impute missing values in hierarchical datasets, address confidentiality concerns, combine data from different sources, or correct measurement errors in surveys. However, software developments did not keep up with these recent extensions. Most imputation software can only deal with item nonresponse in cross-sectional settings and extensions for hierarchical data - if available at all - are typically limited in scope. Furthermore, to our knowledge no software is currently available for dealing with measurement error using multiple imputation approaches. The R package hmi tries to close some of these gaps. It offers multiple imputation routines in hierarchical settings form any variable types (for example, nominal, ordinal, or continuous variables). It also provides imputation routines for interval data and handles a common measurement error problem in survey data: Biased inferences due to implicit rounding of the reported values. The user-friendly setup which only requires the data and optionally the specification of the analysis model of interest makes the package especially attractive for users less familiar with the peculiarities of multiple imputation. The compatibility with the popular mice package ensures that the rich set of analysis and diagnostic tools and post-imputation commands available in mice can be used easily once the data have been imputed." (Author's abstract, IAB-Doku) ((en))
