Statistical Methods for Estimating Within-Cluster Effects for Clustered Poisson Data

Clustered Poisson data frequently appear in medical research, and interest often focuses on an exposure effect within clusters. The objective of this paper is to compare the performance of six methods for estimating the exposure effect for clustered Poisson data: 1) independent Poisson; 2) fixed cluster effects Poisson; 3) conditional likelihood Poisson estimation; 4) Generalized Estimating Equations (GEE); 5) random cluster effects Poisson; and 6) random cluster effects Poisson with separate between- and within-cluster effects. Biases and standard errors of the within-cluster exposure effect are compared across the six methods under constant or varying exposure ratio (the ratio of the number of exposed to unexposed subjects in a cluster), constant or varying cluster sizes, different within-cluster exposure effects, different cluster variances, and different numbers of clusters. Simulations and theoretical results show that the exposure ratio is a key quantity. With constant exposure ratio designs, maximum likelihood estimates and asymptotic standard errors are obtained in closed form, and all models except GEE give equivalent estimates and standard errors of the within-cluster exposure effect. With varying exposure ratio designs, the conditional likelihood and fixed cluster effects methods yield the same estimates and standard errors for the exposure effect. Results from the random cluster effects Poisson model with separate between- and within-cluster effects are very similar to those from the fixed cluster effects Poisson and conditional Poisson methods. We apply these approaches to birth cohort data to analyze the incidence of Respiratory Syncytial Virus (RSV) infection in young children in Indonesia.
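As an illustration of why the exposure ratio matters, the following minimal sketch compares the independent Poisson estimate with the conditional likelihood estimate on a small made-up data set in which every cluster has the same exposure ratio (1 exposed to 2 unexposed). It is not the authors' code or derivation: the counts, the function names, and the Newton-Raphson solver are assumptions chosen for illustration, and a single binary exposure with cluster-specific intercepts is assumed. Under that assumption, conditioning on each cluster's total event count turns the exposed count into a binomial variable, and with a constant exposure ratio the estimate reduces to log(Y1/Y0) - log(r), with asymptotic standard error sqrt(1/Y1 + 1/Y0), where Y1 and Y0 are the total exposed and unexposed event counts.

```python
import math

# Hypothetical data: each cluster has n1 exposed and n0 unexposed subjects,
# with y1 and y0 events among them. The exposure ratio n1/n0 is 0.5 everywhere.
CLUSTERS = [
    {"n1": 2, "n0": 4, "y1": 6, "y0": 5},
    {"n1": 3, "n0": 6, "y1": 12, "y0": 9},
    {"n1": 1, "n0": 2, "y1": 2, "y0": 3},
]


def pooled_log_rate_ratio(clusters):
    """Independent Poisson estimate: log ratio of the pooled event rates."""
    y1 = sum(c["y1"] for c in clusters)
    y0 = sum(c["y0"] for c in clusters)
    n1 = sum(c["n1"] for c in clusters)
    n0 = sum(c["n0"] for c in clusters)
    return math.log((y1 / n1) / (y0 / n0))


def _score_and_information(clusters, beta):
    """Score and information of the conditional (binomial) log-likelihood."""
    score = 0.0
    info = 0.0
    for c in clusters:
        total = c["y1"] + c["y0"]
        # Probability that an event in this cluster falls on an exposed subject.
        odds = c["n1"] * math.exp(beta) / c["n0"]
        p = odds / (1.0 + odds)
        score += c["y1"] - total * p
        info += total * p * (1.0 - p)
    return score, info


def conditional_log_rate_ratio(clusters, tol=1e-12, max_iter=100):
    """Conditional likelihood estimate and asymptotic SE via Newton-Raphson."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = _score_and_information(clusters, beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = _score_and_information(clusters, beta)
    return beta, 1.0 / math.sqrt(info)


beta_pooled = pooled_log_rate_ratio(CLUSTERS)
beta_cond, se_cond = conditional_log_rate_ratio(CLUSTERS)
print(f"independent Poisson: {beta_pooled:.6f}")
print(f"conditional:         {beta_cond:.6f} (SE {se_cond:.6f})")
```

Changing the `n1` and `n0` values so that the ratio differs between clusters makes the two estimates diverge in general, which is the varying exposure ratio setting in which the choice of method matters.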
