Bias reduction in exponential family nonlinear models

In Firth (1993, Biometrika) it was shown how the leading term in the asymptotic bias of the maximum likelihood estimator is removed by adjusting the score vector, and that in canonical-link generalized linear models the method is equivalent to maximizing a penalized likelihood that is easily implemented via iterative adjustment of the data. Here a more general family of bias-reducing adjustments is developed for a broad class of univariate and multivariate generalized nonlinear models. The resulting formulae for the adjusted score vector are computationally convenient, and in univariate models they directly suggest implementation through an iterative scheme of data adjustment. For generalized linear models a necessary and sufficient condition is given for the existence of a penalized likelihood interpretation of the method. An illustrative application to the Goodman row-column association model shows how the computational simplicity and statistical benefits of bias reduction extend beyond generalized linear models.
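For the simplest canonical-link case, binary logistic regression, the iterative data adjustment mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the general formulae for nonlinear or multivariate models developed in the paper: the function name `firth_logistic`, its arguments, and the plain quasi-Fisher-scoring loop without step-halving are all hypothetical choices made here. Each iteration replaces the response y_i by the pseudo-response y_i + h_i/2 and the binomial total 1 by 1 + h_i, where h_i is the i-th leverage (the diagonal of the weighted "hat" matrix).

```python
import numpy as np


def firth_logistic(X, y, max_iter=100, tol=1e-10):
    """Bias-reduced logistic regression for binary y (illustrative sketch).

    Solves the adjusted score equation
        sum_i x_i * (y_i + h_i/2 - (1 + h_i) * pi_i) = 0,
    which is the score of the likelihood penalized by the Jeffreys prior.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))    # fitted probabilities
        w = pi * (1.0 - pi)                       # working weights
        info = X.T @ (X * w[:, None])             # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Leverages h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Adjusted score: pseudo-response y + h/2, pseudo-total 1 + h
        score = X.T @ (y + h / 2.0 - (1.0 + h) * pi)
        step = info_inv @ score                   # quasi-Fisher-scoring step
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Completely separated data: the maximum likelihood slope is infinite,
# while the bias-reduced estimate remains finite.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
beta_br = firth_logistic(X, y)
```

A production implementation would normally add step-halving or a line search and a check that the information matrix is well conditioned; those safeguards are omitted here to keep the data-adjustment step visible.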
