Subsample ignorable likelihood for regression analysis with missing data

Summary. Two common approaches to regression with missing covariates are complete-case analysis and ignorable likelihood methods. We review these approaches and propose a hybrid class, called subsample ignorable likelihood methods, which applies an ignorable likelihood method to the subsample of observations that are complete on one set of variables but possibly incomplete on others. We present conditions on the missing data mechanism under which subsample ignorable likelihood gives consistent estimates, whereas both complete-case analysis and ignorable likelihood methods are inconsistent. We motivate the proposed method and apply it to data from the National Health and Nutrition Examination Survey, and we illustrate properties of the methods by simulation. Extensions to non-likelihood analyses are also mentioned.
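
The subsample selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy data, and the column roles (outcome y, covariate w required to be complete, covariate x allowed to be incomplete) are all assumptions made for the example. The ignorable likelihood step itself (for instance maximum likelihood or multiple imputation) would then be run on the returned subsample and is not shown.

```python
import numpy as np

def sil_subsample(data, complete_cols):
    """Keep rows with no NaN in `complete_cols`; other columns may stay incomplete.

    Hypothetical helper: selects the subsample on which an ignorable
    likelihood method would subsequently be applied.
    """
    data = np.asarray(data, dtype=float)
    keep = ~np.isnan(data[:, complete_cols]).any(axis=1)
    return data[keep]

def complete_cases(data):
    """Keep only rows with no NaN in any column (complete-case analysis)."""
    data = np.asarray(data, dtype=float)
    return data[~np.isnan(data).any(axis=1)]

# Toy data (assumed layout): columns are y, w, x.
toy = np.array([
    [1.0, 0.5,    2.0],     # fully observed
    [2.0, np.nan, 1.0],     # w missing: dropped by both approaches
    [3.0, 1.5,    np.nan],  # x missing: kept in the subsample, dropped by complete-case
    [4.0, 2.5,    3.0],     # fully observed
])

sil = sil_subsample(toy, complete_cols=[1])  # require w (column 1) to be observed
cc = complete_cases(toy)
```

Here the subsample retains the row with x missing, which complete-case analysis discards, while still excluding the row with w missing, which a full ignorable likelihood analysis would retain.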
