Log-likelihood-based Pseudo-R2 in Logistic Regression

The literature proposes numerous so-called pseudo-R2 measures for evaluating “goodness of fit” in regression models with categorical dependent variables. Unlike the R2 of ordinary least squares regression, log-likelihood-based pseudo-R2s do not represent the proportion of explained variance but rather the improvement in model likelihood over a null model. The multitude of available pseudo-R2 measures and the absence of benchmarks often lead to confusing interpretations and unclear reporting. Drawing on a meta-analysis of 274 published logistic regression models as well as simulated data, this study investigates fundamental differences among pseudo-R2 measures, focusing on their dependence on basic study design characteristics. Results indicate that almost all pseudo-R2s are influenced to some extent by sample size, the number of predictor variables, the number of categories of the dependent variable, and the asymmetry of its distribution. Hence, any interpretation based on goodness-of-fit benchmark values must explicitly consider these characteristics. The authors derive a set of goodness-of-fit benchmark values with respect to ranges of sample size and distribution of observations for this measure. This study raises awareness of fundamental differences in characteristics of pseudo-R2s and the need for greater precision in reporting these measures.
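To make "improvement in model likelihood over a null model" concrete, the following minimal sketch computes three widely used log-likelihood-based pseudo-R2s from a fitted model's log-likelihood and that of an intercept-only model. The abstract does not name the specific measures studied, so the choice of McFadden, Cox-Snell, and Nagelkerke here is an assumption made for illustration, as are the function names and the example log-likelihood of -50.0.

```python
import math


def null_loglik_binary(n_events, n_nonevents):
    """Log-likelihood of an intercept-only model for a binary outcome.

    The intercept-only model predicts the observed event share for every
    observation, so its log-likelihood depends only on the class counts.
    """
    if n_events <= 0 or n_nonevents <= 0:
        raise ValueError("both outcome categories must be observed")
    n = n_events + n_nonevents
    p = n_events / n
    return n_events * math.log(p) + n_nonevents * math.log(1.0 - p)


def pseudo_r2(ll_model, ll_null, n):
    """Return three common log-likelihood-based pseudo-R2 measures.

    ll_model: log-likelihood of the fitted model
    ll_null:  log-likelihood of the intercept-only (null) model
    n:        number of observations
    """
    # McFadden: relative reduction in the (negative) log-likelihood.
    mcfadden = 1.0 - ll_model / ll_null
    # Cox-Snell: based on the likelihood ratio, scaled by sample size.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Upper bound of Cox-Snell; it depends on the outcome distribution.
    cox_snell_max = 1.0 - math.exp(2.0 * ll_null / n)
    # Nagelkerke: Cox-Snell rescaled so that its maximum is 1.
    nagelkerke = cox_snell / cox_snell_max
    return {
        "mcfadden": mcfadden,
        "cox_snell": cox_snell,
        "nagelkerke": nagelkerke,
    }


# Hypothetical example: 100 observations, balanced outcome, and an
# assumed fitted-model log-likelihood of -50.0 (not taken from the study).
ll0 = null_loglik_binary(50, 50)
for name, value in pseudo_r2(ll_model=-50.0, ll_null=ll0, n=100).items():
    print(f"{name}: {value:.3f}")
```

The three measures give different values for the same pair of log-likelihoods, and the Cox-Snell upper bound changes with the split between outcome categories. This is one simple way to see why a single benchmark value cannot be applied across measures or study designs without further qualification.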
