ONE MORE TIME ABOUT R² MEASURES OF FIT IN LOGISTIC REGRESSION

In logistic regression, the demand for pseudo-R² measures of fit is undeniable. There are at least half a dozen such measures, with little consensus on which is preferable. Two of them, both based on the maximum likelihood, are used in almost all statistical software systems. The first, R²_1, is implemented in SAS and SPSS. The second, R²_2 (also known as McFadden's R², R²_MF; the deviance R², R²_DEV; and the entropy R², R²_E), is implemented in STATA and SUDAAN as well as SPSS. Until recently these two measures have been treated as independent. We show in this presentation, a sequel to our SUGI 25 paper, that there is a one-to-one correspondence between R²_1 and R²_2: if we know one of them, we know the other. Understanding this relationship is necessary to decide which of the two is preferable on theoretical grounds. We base this choice on how readily the measure can be interpreted, on its dependence on the base rate, and on its susceptibility to overdispersion. We conclude that R²_2 should be regarded as the standard R² measure.
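
The correspondence between the two measures can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: that the first measure is the generalized (Cox and Snell) R², 1 - exp(-2[logL(M) - logL(0)]/n), and that the second is McFadden's R², 1 - logL(M)/logL(0), where logL(M) and logL(0) are the maximized log-likelihoods of the fitted and intercept-only models and n is the sample size. Under those definitions, for a fixed n and logL(0) (that is, a fixed base rate), each measure is a monotone function of the other. The function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def mcfadden_r2(loglik_model, loglik_null):
    """Second measure (assumed definition): 1 - logL(M) / logL(0)."""
    return 1.0 - loglik_model / loglik_null


def cox_snell_r2(loglik_model, loglik_null, n):
    """First measure (assumed definition): 1 - exp(-2 [logL(M) - logL(0)] / n)."""
    return 1.0 - math.exp(-2.0 * (loglik_model - loglik_null) / n)


def cox_snell_from_mcfadden(r2_mcfadden, loglik_null, n):
    """Map the second measure to the first for fixed logL(0) and n."""
    t = -2.0 * loglik_null / n  # positive, because logL(0) is negative
    return 1.0 - math.exp(-r2_mcfadden * t)


def mcfadden_from_cox_snell(r2_cox_snell, loglik_null, n):
    """Inverse mapping: recover the second measure from the first."""
    t = -2.0 * loglik_null / n
    return -math.log(1.0 - r2_cox_snell) / t


# Hypothetical example: n = 100 binary outcomes with base rate 0.5,
# so the intercept-only log-likelihood is n * ln(0.5).
n = 100
loglik_null = n * math.log(0.5)
loglik_model = -50.0  # hypothetical fitted-model log-likelihood

r2_second = mcfadden_r2(loglik_model, loglik_null)
r2_first = cox_snell_r2(loglik_model, loglik_null, n)

# Each measure can be recovered from the other.
first_recovered = cox_snell_from_mcfadden(r2_second, loglik_null, n)
second_recovered = mcfadden_from_cox_snell(r2_first, loglik_null, n)
```

In this sketch the quantity t = -2 logL(0)/n depends only on the base rate, which is why the mapping between the two measures changes with the base rate even though it is one-to-one for any given data set.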