Improving the Hosmer-Lemeshow Goodness-of-Fit Test in Large Models with Replicated Trials

The Hosmer-Lemeshow (HL) test is a commonly used global goodness-of-fit (GOF) test that assesses the overall fit of a logistic regression model. In this paper, we present simulation results showing that the type I error rate (and hence the power) of the HL test decreases as model complexity grows, provided that the sample size remains fixed and binary replicates are present in the data. We demonstrate that the generalized version of the HL test by Surjanovic et al. (2020) can offer some protection against this power loss. We conclude with a brief discussion explaining the behaviour of the HL test, along with some guidance on how to choose between the two tests.
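
For readers unfamiliar with the statistic, the following is a minimal sketch of the standard HL computation, not the code used in the paper. It assumes the fitted probabilities have already been obtained from a logistic regression fit, that observations are split into g groups of near-equal size after sorting by fitted probability, and that the reference distribution is chi-squared with g - 2 degrees of freedom. The function names `hosmer_lemeshow` and `chi2_sf_even_df` are hypothetical, and the p-value helper handles only even degrees of freedom (such as the usual g = 10, df = 8) so that the example needs nothing beyond the standard library.

```python
import math


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, df) for the standard HL test.

    y: observed 0/1 responses; p: fitted probabilities from a logistic model.
    Observations are sorted by p and cut into `groups` bins of near-equal size.
    """
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    n = len(y)
    # Python's sort is stable, so tied fitted values keep their input order.
    order = sorted(range(n), key=lambda i: p[i])
    stat = 0.0
    for k in range(groups):
        idx = order[k * n // groups:(k + 1) * n // groups]
        n_k = len(idx)
        if n_k == 0:
            continue  # more groups than observations: skip empty bins
        observed = sum(y[i] for i in idx)
        expected = sum(p[i] for i in idx)
        mean_p = expected / n_k
        denom = n_k * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat, groups - 2


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability, closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch supports positive even df only")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Toy data, assumed for illustration only.
    y = [1, 1, 0, 0]
    p = [0.5, 0.5, 0.5, 0.5]
    stat, df = hosmer_lemeshow(y, p, groups=2)
    print(stat, df)
```

When replicates produce tied fitted probabilities, the group boundaries depend on how ties are ordered. This sketch simply keeps the input order, which is one of several possible conventions.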

[1] A. Tsiatis. A note on a goodness-of-fit test for the logistic regression model, 1980.

[2] R. Lockhart, et al. A generalized Hosmer–Lemeshow goodness-of-fit test for a family of generalized linear models, 2020, Test.

[3] David S. Moore, et al. Unified Large-Sample Theory of General Chi-Squared Statistics for Tests of Fit, 1975.

[4] G. Apolone, et al. One model, several results: the paradox of the Hosmer-Lemeshow goodness-of-fit test for the logistic regression model, 2000, Journal of epidemiology and biostatistics.

[5] Nils Lid Hjort, et al. Goodness‐of‐fit processes for logistic regression: simulation results, 2002, Statistics in medicine.

[6] D. Hosmer, et al. A review of goodness of fit statistics for use in the development of logistic regression models, 1982, American journal of epidemiology.

[7] Thomas M. Loughin, et al. Analysis of Categorical Data with R, 2014.

[8] D. Hosmer, et al. A comparison of goodness-of-fit tests for the logistic regression model, 1997, Statistics in medicine.

[9] S. Lemeshow, et al. Predicting the Outcome of Intensive Care Unit Patients, 1988.

[10] Lixing Zhu, et al. Model Checks for Generalized Linear Models, 2002.

[11] D. Hosmer, et al. Goodness of fit tests for the multiple logistic regression model, 1980.

[12] Ronald P. Barry, et al. Summary goodness‐of‐fit statistics for binary generalized linear models with noncanonical link functions, 2016, Biometrical journal. Biometrische Zeitschrift.