Ignorance and uncertainty regions as inferential tools in a sensitivity analysis

It has long been recognised that, when data are missing, most standard point estimators lean heavily on untestable assumptions. Statisticians have therefore advocated sensitivity analysis, but have paid relatively little attention to strategies for summarising the results of such analyses: summaries that have a clear interpretation, verifiable properties and a feasible implementation. As a step in this direction, several authors have proposed shifting the focus of inference from point estimators to estimated intervals or regions of ignorance. These regions combine the standard point estimates obtained under all possible or plausible missing data models that yield identified parameters of interest. They thus reflect the information achievable from the given data generation structure with its missing data component. The standard framework of inference needs to be extended to allow a transparent study of the statistical properties of such regions. In this paper we propose a definition of consistency for a region and introduce the concepts of pointwise, weak and strong coverage for larger regions, which acknowledge sampling imprecision in addition to the structural lack of information. These larger regions, called uncertainty regions, quantify an overall level of information by adding the imprecision due to sampling error to the estimated region of ignorance. Keeping ignorance and sampling error distinct is often useful, for instance when sample size is being considered. The type of coverage required depends on the goal of the analysis. We provide algorithms for constructing several types of uncertainty regions and derive general relationships between them. Based on the estimated uncertainty regions, we show how classical hypothesis tests can be performed without untestable assumptions about the missingness mechanism.
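The distinction between the two kinds of region can be illustrated with a deliberately simple case, which is not taken from the paper: a binary outcome with some values missing and no assumptions about why they are missing. The sketch below is a hypothetical example, with function names chosen for illustration only. It takes the estimated ignorance interval for the success probability to be the range obtained by setting every missing outcome to failure (lower end) or to success (upper end), and then widens each end by a Wald-type margin to obtain an uncertainty interval. Using the two-sided normal quantile at both ends is a conservative choice made here for simplicity; it is not claimed to reproduce the algorithms or the specific coverage definitions developed in the paper.

```python
import math


def ignorance_interval(n_success, n_failure, n_missing):
    """Estimated range of the success probability over all ways of
    filling in the missing binary outcomes (hypothetical helper)."""
    n = n_success + n_failure + n_missing
    lower = n_success / n                # every missing outcome is a failure
    upper = (n_success + n_missing) / n  # every missing outcome is a success
    return lower, upper


def uncertainty_interval(n_success, n_failure, n_missing, z=1.96):
    """Widen the estimated ignorance interval by a Wald-type margin at
    each end to account for sampling error (illustrative assumption)."""
    n = n_success + n_failure + n_missing
    lower, upper = ignorance_interval(n_success, n_failure, n_missing)
    # Each end point is itself a sample proportion out of n subjects,
    # so the usual binomial standard error applies to it.
    se_lower = math.sqrt(lower * (1.0 - lower) / n)
    se_upper = math.sqrt(upper * (1.0 - upper) / n)
    # Clip to [0, 1] because the target is a probability.
    return max(0.0, lower - z * se_lower), min(1.0, upper + z * se_upper)


if __name__ == "__main__":
    # Made-up counts: 40 successes, 40 failures, 20 missing outcomes.
    print(ignorance_interval(40, 40, 20))
    print(uncertainty_interval(40, 40, 20))
```

In this sketch the width of the ignorance interval equals the proportion of missing outcomes and does not shrink as the sample grows, whereas the extra margin added by the uncertainty interval shrinks at the usual square-root rate. That is one way to see why separating the two sources of uncertainty matters for sample size considerations.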
