Efficiency of reduced logistic regression models

In the usual polychotomous logistic regression model for categorical outcomes, a covariate must be included in every regression equation. If a covariate is not important in all of them, the procedure estimates unnecessary parameters. More flexible approaches allow different subsets of covariates in different regressions. One alternative uses individualized regressions, which express the polychotomous model as a series of dichotomous models. Another uses a reduced model, in which a reduced set of parameters is estimated simultaneously for all the regressions. The large-sample efficiencies of these procedures were compared in a variety of circumstances in which the outcome had a common baseline category and the covariates were normally distributed. For a correctly specified model, the reduced estimates were more than 100% efficient for nonzero slope parameters, and up to 500% efficient when both the baseline frequency and the effect of interest were small. The individualized estimates could have efficiencies below 50% when the effect of interest was large, but were up to 130% efficient when the baseline frequency was large and the effect of interest was small. Efficiency was usually enhanced by correlation among the covariates. For an underspecified reduced model, the asymptotic bias in the reduced estimates was approximately proportional to the magnitude of the omitted parameter and to the reciprocal of the baseline frequency.
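As a rough illustration of the individualized-regression idea mentioned in the abstract, the sketch below simulates a three-category outcome with a common baseline category (coded 0) and a single normally distributed covariate, then fits each non-baseline category against the baseline with an ordinary dichotomous logistic regression restricted to subjects in those two categories. It is not the authors' code and it does not compute the efficiencies reported above. The function names (`simulate`, `fit_binary_logit`, `individualized_fit`), the parameter values, and the restriction to one covariate are all assumptions made for the example.

```python
import math
import random


def simulate(n, alphas, betas, seed=0):
    """Draw (x, y) pairs from a polychotomous logistic model.

    Category 0 is the baseline (linear predictor fixed at 0); category j >= 1
    has linear predictor alphas[j-1] + betas[j-1] * x. Illustrative only.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # normally distributed covariate
        weights = [1.0] + [math.exp(a + b * x) for a, b in zip(alphas, betas)]
        y = rng.choices(range(len(weights)), weights=weights)[0]
        data.append((x, y))
    return data


def fit_binary_logit(xs, ys, iters=25):
    """Newton-Raphson fit of logit P(y = 1 | x) = a + b * x; returns (a, b)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            r = y - p            # score contribution
            w = p * (1.0 - p)    # information contribution
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information matrix) * step = score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def individualized_fit(data, category):
    """Fit `category` versus baseline 0, discarding all other categories."""
    subset = [(x, 1 if y == category else 0) for x, y in data if y in (0, category)]
    xs, ys = zip(*subset)
    return fit_binary_logit(xs, ys)


if __name__ == "__main__":
    # Hypothetical parameter values: category 1 depends on x, category 2 does not.
    sample = simulate(20000, alphas=[0.0, -0.5], betas=[1.0, 0.0], seed=1)
    for j in (1, 2):
        intercept, slope = individualized_fit(sample, j)
        print(f"category {j} vs baseline: intercept={intercept:.3f}, slope={slope:.3f}")
```

Each individualized fit uses only the subjects in the baseline category and one other category, so information from the remaining categories is discarded. That is one way to read the efficiency differences the abstract describes, although the sketch makes no attempt to quantify them.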
