Goodness-of-fit for logistic regression models developed using data collected from a complex sampling design

Lemeshow et al. (1998) demonstrated the necessity of accounting for the sampling design when constructing logistic regression models. These “design-based” methods of estimation, that is, methods that account for the sampling design and statistical weights, are available in software packages such as Stata (Stata Corporation; College Station, TX), SUDAAN (SUDAAN Statistical Software Center; Research Triangle Park, NC), and WesVar (Westat, Inc.; Rockville, MD). Hosmer and Lemeshow (2000) illustrated the model-building process for logistic regression using sample survey data. As in the “iid-based” setting, that is, when the data are assumed to be independent and identically distributed, once the final logistic regression model has been determined, its adequacy should be examined through overall goodness-of-fit tests and an examination of influential observations. Although a variety of goodness-of-fit tests for iid-based logistic regression analysis have been proposed and their properties examined, no such test exists for models fit to sample survey data. Furthermore, although Roberts et al. (1987) extended diagnostic measures (Pregibon, 1981) to the complex survey setting, these diagnostics are not readily available in current software that calculates design-based logistic regression estimates. Because design-based goodness-of-fit and diagnostic assessments are unavailable in current software, Hosmer and Lemeshow (2000) suggested using iid-based diagnostics in the sample survey setting and applying any findings to the design-based model.
This dissertation was therefore undertaken to explore three areas of design-based logistic regression analysis. First, a comparison of iid-based and design-based logistic regression estimates was conducted. These estimates were compared with respect to bias, mean absolute bias, mean squared error, and coverage probability. Second, various design-based goodness-of-fit tests were proposed, and extensive simulation studies were conducted to examine their performance. These proposed tests were further compared to commonly used iid-based goodness-of-fit tests. Finally, a Stata routine was developed that calculates the design-based diagnostic statistics discussed by Roberts et al. (1987). Both a case study and a simulation study were conducted to compare iid-based and design-based diagnostic statistics. Because no method has been proposed for handling the weights associated with outlying covariate patterns, two methods of redistributing the weights of deleted covariate patterns were explored.
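As an illustration of the four measures used to compare the iid-based and design-based estimates, the following minimal Python sketch computes them from a set of simulation replicates for a single coefficient. It is not the dissertation's code: the function name `summarize_estimates`, its arguments, the reading of mean absolute bias as the average absolute deviation from the true value, and the use of a normal-theory interval with `z = 1.96` are all assumptions made for the example.

```python
def summarize_estimates(estimates, std_errors, true_value, z=1.96):
    """Summarize simulation replicates of one coefficient estimate.

    estimates, std_errors: one value per simulated sample (hypothetical inputs).
    true_value: the coefficient used to generate the simulated data.
    z: critical value for the interval (assumed normal-theory, ~95%).
    """
    n = len(estimates)
    errors = [b - true_value for b in estimates]

    # Bias: average signed deviation from the true value.
    bias = sum(errors) / n
    # Mean absolute bias (assumed definition): average absolute deviation.
    mean_abs_bias = sum(abs(e) for e in errors) / n
    # Mean squared error: average squared deviation.
    mse = sum(e * e for e in errors) / n
    # Coverage: share of replicates whose interval contains the true value.
    covered = sum(
        1
        for b, se in zip(estimates, std_errors)
        if b - z * se <= true_value <= b + z * se
    )
    return {
        "bias": bias,
        "mean_abs_bias": mean_abs_bias,
        "mse": mse,
        "coverage": covered / n,
    }
```

Calling the function once with the iid-based estimates and standard errors and once with the design-based ones, for the same true coefficient, would give the two sets of summaries to compare side by side.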