Evaluation of graphical diagnostics for assessing goodness of fit of logistic regression models

The aim of the current work was to evaluate graphical diagnostics for assessing the fit of logistic regression models. Assessing the goodness of fit of a model to the data set is essential to ensure that the model provides an acceptable description of the observed binary variables. For logistic regression, the most common diagnostic used for this purpose is to bin the data and plot, against the mean covariate value in each bin, the empirical probability of occurrence of the dependent variable together with the model-predicted probability. Although intuitively appealing, this method, which we term simple binning, may not have consistent properties for diagnosing model problems. In this report we describe and evaluate two different diagnostic procedures: random binning and simplified Bayes marginal model plots. These procedures were assessed via simulation under three different designs. Design 1: studies that were balanced on the binary variables and on a continuous covariate. Design 2: studies that were balanced on the binary variables but unbalanced on the continuous covariate. Design 3: studies that were unbalanced on both the binary variables and the covariate. Thirty studies were simulated, each consisting of 500 individuals. The covariate of interest was dose, which could range from 0 to 20 units. The data were simulated with dose related to the outcome according to an Emax model on the logit scale. A logit Emax model (the correct model) and a logit linear model (the wrong model) were fitted to all data sets. The performance of the two proposed diagnostics was compared with that of simple binning. For all designs, the proposed diagnostics performed at least as well as, and in many instances better than, simple binning. In the case of design 1, random binning and simple binning are identical. In the case of designs 2 and 3, random binning and simplified Bayes marginal model plots were superior to simple binning in assessing model fit.
For the examples tested, both random binning and simplified Bayes marginal model plots performed acceptably.
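
The simulation setting and the simple binning diagnostic described above can be illustrated with a minimal sketch. It is not the authors' code. The parameter values (`e0`, `emax`, `ed50`), the uniform dose distribution, the use of ten equal-count bins, and all function names are assumptions made for illustration only; the abstract does not specify them. For brevity, the predicted probabilities come from the simulating Emax model itself rather than from a fitted model.

```python
import numpy as np


def emax_logit_prob(dose, e0=-2.0, emax=4.0, ed50=5.0):
    """Event probability under an Emax model on the logit scale.

    The parameter values are hypothetical, chosen only for illustration.
    """
    logit = e0 + emax * dose / (ed50 + dose)
    return 1.0 / (1.0 + np.exp(-logit))


def simulate_study(n=500, max_dose=20.0, seed=0):
    """Simulate one study: doses in [0, max_dose] and a binary outcome.

    Uniformly distributed doses are an assumption, not the paper's design.
    """
    rng = np.random.default_rng(seed)
    dose = rng.uniform(0.0, max_dose, size=n)
    y = rng.binomial(1, emax_logit_prob(dose))
    return dose, y


def simple_binning(dose, y, predicted, n_bins=10):
    """Per-bin mean dose, observed event proportion, and mean predicted probability.

    Bins hold (nearly) equal numbers of individuals, ordered by dose.
    """
    order = np.argsort(dose)
    rows = []
    for idx in np.array_split(order, n_bins):
        rows.append((dose[idx].mean(), y[idx].mean(), predicted[idx].mean()))
    return np.array(rows)


dose, y = simulate_study()
table = simple_binning(dose, y, emax_logit_prob(dose))
for mean_dose, observed, expected in table:
    print(f"{mean_dose:6.2f}  observed={observed:.3f}  predicted={expected:.3f}")
```

Plotting the observed and predicted columns against the mean dose per bin gives the simple binning plot; a misspecified model, such as the logit linear model, would be expected to show systematic gaps between the two curves.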
