The Impact of Nonignorable Missing Data on the Inference of Regression Coefficients.

Various statistical methods are available for dealing with missing data, but they rest on somewhat restrictive assumptions: that the missing-data patterns are known or can be modeled with auxiliary information. This paper treats missing cases as a problem of generalization, in which the sample does not fully represent the target population. An index is developed to detect the impact of missing data on inferences about regression coefficients, in terms of statistical significance tests. The population is assumed to consist of two separable subpopulations in which the linear relationship among the variables of interest differs, and a sample from the population underrepresents or overrepresents one of the subpopulations. To derive the index, four hypothetical situations in simple regression are considered, and an extension to the multivariate case is provided. In addition, the features of the index are discussed in comparison with other statistical methods for missing data, such as propensity scores, nonparametric models, and Fail-Safe N. (Contains 1 table, 7 figures, and 36 references.) (SLD)
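The two-subpopulation setup described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the index developed in the paper: it only shows how a pooled simple-regression slope shifts when missing cases leave one subpopulation underrepresented. All names (`ols_slope`, `draw`, `pooled_slope`) and all numeric values (slopes of 0.5 and 0.0, the sample sizes, the seed) are hypothetical choices made for illustration.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of y on x in a simple regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def draw(n, slope, rng):
    """Draw n (x, y) pairs from one subpopulation: y = slope * x + noise."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def pooled_slope(n_a, n_b, slope_a, slope_b, seed=0):
    """Slope estimated from a sample that mixes subpopulations A and B."""
    rng = random.Random(seed)
    xa, ya = draw(n_a, slope_a, rng)
    xb, yb = draw(n_b, slope_b, rng)
    return ols_slope(xa + xb, ya + yb)


# Hypothetical scenario: A and B are equally common in the target population,
# but missing cases remove most of B from the observed sample.
representative = pooled_slope(5000, 5000, slope_a=0.5, slope_b=0.0)
b_underrepresented = pooled_slope(9000, 1000, slope_a=0.5, slope_b=0.0)

print(f"representative sample slope:    {representative:.3f}")
print(f"B underrepresented, slope:      {b_underrepresented:.3f}")
```

Because both subpopulations share the same distribution of x in this sketch, the pooled slope is approximately the mixture-weighted average of the two subpopulation slopes. Underrepresenting B therefore pulls the estimate toward the slope in A, which is the kind of shift in inference the abstract describes.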