Statistical Matching using Fractional Imputation

Statistical matching is a technique for integrating two or more data sets when the information available for linking the records of individual participants across data sets is incomplete. It can be viewed as a missing data problem in which a researcher wants to perform a joint analysis of variables that are never jointly observed. A conditional independence assumption is often used to create imputed data for statistical matching. We consider an alternative approach that does not rely on the conditional independence assumption. We apply the parametric fractional imputation of Kim (2011) to create imputed data, using an instrumental variable assumption to identify the joint distribution. We also present variance estimators appropriate for the imputation procedure. We explain how the method applies directly to the analysis of data from split questionnaire designs and measurement error models.
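The fractional imputation step can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: two normal linear models, a record from a sample that observes (x1, x2, y2) but not y1, parameter values `alpha`, `sd1`, `beta`, `sd2` treated as already estimated, and the function name `fractional_impute`. Candidate values of y1 are drawn from the model for y1 given (x1, x2), and each candidate receives a fractional weight proportional to the density of the observed y2 given (x1, y1). The instrument x2 is left out of the y2 model, which is one way to read the instrumental variable assumption.

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fractional_impute(x1, x2, y2, alpha, sd1, beta, sd2, m=50, rng=None):
    """Return m candidate y1 values and their fractional weights for one record.

    Hypothetical working models (assumptions for this sketch):
      y1 | x1, x2 ~ N(alpha[0] + alpha[1]*x1 + alpha[2]*x2, sd1^2)
      y2 | x1, y1 ~ N(beta[0] + beta[1]*x1 + beta[2]*y1, sd2^2)
    x2 plays the role of the instrument: it enters the y1 model only.
    """
    rng = rng or random.Random(0)
    # Draw candidate values for the unobserved y1 from its working model.
    mean1 = alpha[0] + alpha[1] * x1 + alpha[2] * x2
    candidates = [rng.gauss(mean1, sd1) for _ in range(m)]
    # Weight each candidate by how well it explains the observed y2.
    raw = [normal_pdf(y2, beta[0] + beta[1] * x1 + beta[2] * c, sd2)
           for c in candidates]
    total = sum(raw)
    # Normalize so the fractional weights of the record sum to one.
    weights = [r / total for r in raw]
    return candidates, weights


candidates, weights = fractional_impute(
    x1=0.5, x2=1.0, y2=4.0,
    alpha=(0.0, 1.0, 1.0), sd1=1.0,
    beta=(0.0, 1.0, 1.0), sd2=1.0,
)
# Fractionally weighted mean of the candidates for this record.
imputed_mean = sum(w * c for w, c in zip(weights, candidates))
```

In a full procedure the parameters would not be fixed: the weights would be recomputed as the parameter estimates are updated, and variance estimation would need to account for the imputation. The sketch covers only the weighting of candidates for a single record.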

[1] Jun Shao, et al. Jackknife Variance Estimation for Nearest-Neighbor Imputation, 2001.

[2] David Haziza, et al. Imputation and Inference in the Presence of Missing Data, 2009.

[3] Christopher Winship, et al. Counterfactuals and Causal Inference: Methods and Principles for Social Research, 2007.

[4] Shu Yang, et al. Fractional hot deck imputation for robust inference under item nonresponse in survey sampling, 2014.

[5] Statistical matching: a model based approach for data integration, 2013.

[6] Marcello D'Orazio, et al. Statistical Matching: Theory and Practice, 2006.

[7] J. N. K. Rao, et al. Combining data from two independent surveys: a model-assisted approach, 2012.

[8] Ken Baker, et al. Data Fusion: An Appraisal and Experimental Evaluation, 1997.

[9] R. Little, et al. Regression analysis with covariates that have heteroscedastic measurement error, 2011, Statistics in Medicine.

[10] William E. Winkler, et al. Data quality and record linkage techniques, 2007.

[11] James O. Chipperfield, et al. Design and Estimation for Split Questionnaire Surveys, 2009.

[12] M. Wedel, et al. Split Questionnaire Design, 2005.

[13] J. Ibrahim. Incomplete Data in Generalized Linear Models, 1990.

[14] Martyn Plummer, et al. JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling, 2003.

[15] John L. Eltinge, et al. Adaptive Matrix Sampling for the Consumer Expenditure Quarterly Interview Survey, 2008.

[16] Trivellore E. Raghunathan, et al. A Split Questionnaire Survey Design, 1995.

[17] P. Lahiri, et al. Regression Analysis With Linked Data, 2005.

[18] Susanne Rässler, et al. Statistical Matching: A Frequentist Theory, Practical Applications, and Alternative Bayesian Approaches, 2002.

[19] D. Boos. On Generalized Score Tests, 1992.

[20] Marcello D'Orazio, et al. Statistical Matching: Theory and Practice (Wiley Series in Survey Methodology), 2006.

[21] Jae Kwang Kim. Parametric fractional imputation for missing data analysis, 2011.

[22] Chris J. Skinner, et al. Quasi-Score Tests with Survey Data, 1998.

[23] C. Moriarity, et al. Statistical Matching: A Paradigm for Assessing the Uncertainty in the Procedure, 2001.

[24] John K. Kruschke, et al. Bayesian data analysis, 2010, Wiley Interdisciplinary Reviews: Cognitive Science.

[25] G. Ridder, et al. The Econometrics of Data Combination, 2007.

[26] J. Beaumont, et al. Variance estimation when donor imputation is used to fill in missing values, 2009.