A Coefficient of Determination for Generalized Linear Models

ABSTRACT The coefficient of determination, also known as R², is well defined for linear regression models, where it measures the proportion of variation in the dependent variable explained by the predictors included in the model. To extend it to generalized linear models, we use the variance function to define both the total variation of the dependent variable and the variation that remains after modeling the predictive effects of the independent variables. Unlike other definitions, which require the likelihood function to be fully specified, our definition of R² needs only the mean and variance functions, and is therefore applicable to more general quasi-models. It is consistent with the classical measure of uncertainty based on variance, and it reduces to the classical coefficient of determination when linear regression models are considered.
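The abstract does not state the formula, so the following is only a minimal sketch of one way the idea could be made concrete. It assumes that variation between two mean values a and b is measured by the squared arc length of the variance function V between them, i.e. the square of the integral of sqrt(1 + V'(t)²) from a to b, and that R² is one minus the ratio of residual to total variation. The function names, the Simpson-rule integration, and the sample data are illustrative assumptions, not taken from the text. With a constant variance function (V' = 0) the distance becomes (b - a)², so the sketch reduces to the classical R², in line with the reduction property claimed in the abstract.

```python
import math


def arc_length_sq(a, b, v_prime, n=200):
    """Squared arc length of the variance function between means a and b.

    Assumed distance (not given in the abstract):
        ( integral_a^b sqrt(1 + V'(t)^2) dt )^2
    evaluated with composite Simpson's rule on n (even) subintervals.
    """
    if a == b:
        return 0.0
    h = (b - a) / n

    def integrand(t):
        return math.sqrt(1.0 + v_prime(t) ** 2)

    total = integrand(a) + integrand(b)
    for k in range(1, n):
        weight = 4 if k % 2 else 2
        total += weight * integrand(a + k * h)
    return (total * h / 3.0) ** 2


def r2_variance_function(y, mu_hat, v_prime):
    """One minus (remaining variation / total variation).

    y       : observed responses
    mu_hat  : fitted means from the model
    v_prime : derivative of the variance function V(mu)
    """
    y_bar = sum(y) / len(y)
    remaining = sum(arc_length_sq(yi, mi, v_prime) for yi, mi in zip(y, mu_hat))
    total = sum(arc_length_sq(yi, y_bar, v_prime) for yi in y)
    return 1.0 - remaining / total


# Hypothetical data: responses and fitted means from some model.
y = [1.0, 2.0, 3.0, 4.0]
mu_hat = [1.5, 1.5, 3.5, 3.5]

# Constant variance, V(mu) = 1, so V'(mu) = 0: the classical R^2.
r2_gaussian = r2_variance_function(y, mu_hat, lambda t: 0.0)

# Poisson-type variance, V(mu) = mu, so V'(mu) = 1.
r2_poisson = r2_variance_function(y, mu_hat, lambda t: 1.0)
```

Only the derivative of the variance function enters the calculation, which illustrates why a definition of this kind needs the mean and variance functions but not a full likelihood.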
