Analyzing the Sensitivity of Generalized Linear Models to Incomplete Outcomes via the IDE Algorithm

Incomplete data models typically involve strong untestable assumptions about the missing data distribution. Because inference may depend critically on these assumptions, the importance of sensitivity analysis is well recognized. Molenberghs, Kenward, and Goetghebeur proposed a formal frequentist approach to sensitivity analysis that distinguishes ignorance, due to unintended incompleteness, from imprecision, due to finite sampling by design. They combine both sources of variation into uncertainty. This article develops estimation tools for ignorance and uncertainty concerning regression coefficients in a complete data model when some of the intended outcome values are missing. Exhaustive enumeration of all possible imputations for the missing data requires enormous computational resources. In contrast, when the boundary of the region occupied by the resulting estimates is of greatest interest, reasonable computational efficiency may be achieved via the imputation towards directional extremes (IDE) algorithm. This special imputation method is designed to mark the boundary of the region by maximizing the directional change of the complete data estimator caused by perturbations to the imputed outcomes. For multi-dimensional parameters, a dimension reduction approach is considered. Additional insights are obtained by considering structures within the region and by introducing external knowledge to narrow the boundary to useful proportions. Special properties hold for the generalized linear model. Examples from a Kenyan HIV study illustrate these points.
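The contrast between exhaustive enumeration and imputing towards an extreme can be illustrated with a toy sketch. This is not the IDE algorithm itself, only a minimal example under simplifying assumptions: a single covariate, a least-squares slope, and binary outcomes with missing values coded as None. In this case the slope is linear in the outcomes, so its largest and smallest values over all imputations are reached by filling each missing outcome according to the sign of its covariate's deviation from the mean. All function names are hypothetical.

```python
from itertools import product


def ols_slope(x, y):
    """Least-squares slope of y on x for complete data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def slope_bounds_exhaustive(x, y):
    """Range of the slope over all 2^m imputations of m missing binary outcomes."""
    missing = [i for i, yi in enumerate(y) if yi is None]
    slopes = []
    for fill in product([0, 1], repeat=len(missing)):
        completed = list(y)
        for i, value in zip(missing, fill):
            completed[i] = value
        slopes.append(ols_slope(x, completed))
    return min(slopes), max(slopes)


def slope_bounds_directional(x, y):
    """Same range from two imputations only (toy assumption: linear estimator).

    The slope changes with y_i in proportion to (x_i - x_bar), so a missing
    outcome is set to 1 to raise the slope when x_i > x_bar and to 0 otherwise;
    the opposite choice lowers it.
    """
    x_bar = sum(x) / len(x)
    upper = [(1 if xi > x_bar else 0) if yi is None else yi
             for xi, yi in zip(x, y)]
    lower = [(0 if xi > x_bar else 1) if yi is None else yi
             for xi, yi in zip(x, y)]
    return ols_slope(x, lower), ols_slope(x, upper)


# Illustrative data only: three of six outcomes are missing.
x = [1, 2, 3, 4, 5, 6]
y = [0, None, 1, None, None, 1]
print(slope_bounds_exhaustive(x, y))
print(slope_bounds_directional(x, y))
```

The exhaustive version evaluates 2^m completed data sets, while the directional version evaluates two. For nonlinear estimators, such as those of a generalized linear model, the extreme imputations are generally not available in closed form, which is the setting a dedicated search procedure is meant to address.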
