Using a Box–Cox transformation in the analysis of longitudinal data with incomplete responses

Summary. We analyse longitudinal data on CD4 cell counts from patients who participated in clinical trials that compared two therapeutic treatments: zidovudine and didanosine. The investigators were interested in modelling the CD4 cell count as a function of treatment, age at base-line and disease stage at base-line. Serious concerns can be raised about the normality assumption for CD4 cell counts that is implicit in many methods, and therefore an analysis may have to start with a transformation. Instead of assuming that we know the transformation (e.g. logarithmic) that makes the outcome normal and linearly related to the covariates, we estimate the transformation within the Box–Cox family by using maximum likelihood. There has been considerable work on the Box–Cox transformation for univariate regression models. Here, we discuss the Box–Cox transformation for longitudinal regression models when the outcome can be missing over time, and we implement a method for maximizing the likelihood, assuming that the missing data are missing at random.
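To make the idea of estimating the transformation concrete, the following is a minimal sketch of maximum likelihood estimation of the Box–Cox parameter in the simpler univariate regression setting with complete data. It is not the longitudinal, missing-data method of the paper. The function names (`box_cox`, `profile_loglik`, `estimate_lambda`) and the grid search over the parameter are illustrative assumptions, and the responses are assumed to be strictly positive.

```python
import numpy as np


def box_cox(y, lam):
    """Box-Cox transform of strictly positive y; the log is the limit at lam = 0."""
    y = np.asarray(y, dtype=float)
    if abs(lam) < 1e-12:
        return np.log(y)
    return (y ** lam - 1.0) / lam


def profile_loglik(lam, y, X):
    """Profile log-likelihood of lam (up to a constant) for a normal
    linear model fitted to the transformed response."""
    y = np.asarray(y, dtype=float)
    z = box_cox(y, lam)
    # Least squares gives the ML estimate of the regression coefficients.
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    n = y.shape[0]
    sigma2 = resid @ resid / n  # ML estimate of the error variance
    # The second term is the log-Jacobian of the transformation.
    return -0.5 * n * np.log(sigma2) + (lam - 1.0) * np.sum(np.log(y))


def estimate_lambda(y, X, grid=None):
    """Grid-search maximizer of the profile log-likelihood (hypothetical helper)."""
    if grid is None:
        grid = np.linspace(-2.0, 2.0, 401)
    values = np.array([profile_loglik(lam, y, X) for lam in grid])
    return float(grid[np.argmax(values)])
```

With a design matrix `X` that includes an intercept column, `estimate_lambda(y, X)` returns the grid value that maximizes the profile likelihood. An estimate near 0 would support the logarithmic transformation, and an estimate near 1 would suggest that no transformation is needed.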
