Marginal Methods for Incomplete Longitudinal Data Arising in Clusters

Inverse probability–weighted generalized estimating equations are commonly used to handle incomplete longitudinal data arising from a missing-at-random mechanism when the marginal means are of primary interest. In many cases, however, the repeated measurements themselves arise in clusters, which leads to both a cross-sectional and a longitudinal correlation structure. In some applications, the strength of these two types of correlation is itself of scientific interest. We develop inverse probability–weighted second-order estimating equations for monotone missing-data patterns which, under specified assumptions, facilitate consistent estimation of both the marginal mean parameters and the association parameters. The missing-data model accommodates cross-sectional clustering in the missing-data indicators, and the probabilities are estimated under a multivariate Plackett model. For computational reasons, we also consider using the alternating logistic regression algorithm to estimate the association parameters for the responses. We investigate by simulation the importance of modeling the cross-sectional clustering in the missing-data process. An extension to intermittently missing data is provided, and an application to a longitudinal cluster-randomized smoking prevention trial is presented.
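
To make the weighting idea concrete, the following is a minimal sketch of inverse probability weights under a monotone (dropout) pattern. It is not the method developed in the paper. It assumes the conditional probabilities of remaining in the study at each visit are already available, it ignores cross-sectional clustering in the missing-data indicators, and it does not fit a Plackett model. The function names `monotone_ipw_weights` and `weighted_mean_by_visit` are hypothetical, and the weighted mean shown is a simple normalized estimator rather than a full estimating equation.

```python
import numpy as np


def monotone_ipw_weights(cond_prob_observed):
    """Inverse probability weights for a monotone missing-data pattern.

    cond_prob_observed[i, j] is assumed to be the probability that subject i
    is observed at visit j given that the subject was observed at visit j-1
    (and given the observed history). Under monotone dropout, the probability
    of being observed at visit j is the cumulative product of these
    conditional probabilities, and the weight is its inverse.
    """
    p = np.asarray(cond_prob_observed, dtype=float)
    if np.any(p <= 0.0) or np.any(p > 1.0):
        raise ValueError("conditional probabilities must lie in (0, 1]")
    return 1.0 / np.cumprod(p, axis=1)


def weighted_mean_by_visit(y, observed, weights):
    """Normalized inverse probability-weighted mean response at each visit.

    Entries of y at unobserved visits are ignored (they may be NaN).
    """
    r = np.asarray(observed, dtype=bool)
    w = np.where(r, np.asarray(weights, dtype=float), 0.0)
    y_filled = np.where(r, np.asarray(y, dtype=float), 0.0)
    return (w * y_filled).sum(axis=0) / w.sum(axis=0)


# Illustrative inputs only (two subjects, two visits); not data from the paper.
cond_prob = np.array([[1.0, 0.5],
                      [1.0, 0.8]])
y = np.array([[1.0, 3.0],
              [2.0, np.nan]])
observed = np.array([[True, True],
                     [True, False]])

weights = monotone_ipw_weights(cond_prob)
visit_means = weighted_mean_by_visit(y, observed, weights)
```

In this sketch a subject who is unlikely to remain under observation receives a larger weight at later visits, so that the observed responses stand in for similar subjects who dropped out.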
