Correlation coefficient estimation involving a left censored laboratory assay variable

When assessing the correlation between two exposure or biological marker variables, one sometimes encounters indeterminate values for one of the variables because of an assay detection limit. In this event, investigators often report correlation coefficients computed after removing the pairs involving non-detectable values, or after substituting some small constant for those values. These ad hoc practices can bias both point and confidence interval estimates of the true correlation coefficient. To address this issue, we consider two parametric techniques for estimating the correlation when one of the variables is left censored. The first is a maximum likelihood approach, and the second is an adaptation of multiple imputation motivated primarily by potential benefits in confidence interval coverage. Both estimators reduce to the standard Pearson correlation coefficient when there is no censoring, and hence are valid in cases where this measure would be appropriate for the complete data. We assess these approaches empirically and contrast them with ad hoc methods for estimating the correlation between cervicovaginal human immunodeficiency virus (HIV) viral load measurements and CD4+ lymphocyte counts from HIV-positive women enrolled in a clinical trial conducted in Bangkok, Thailand.
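To make the maximum likelihood idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the pair (x, y) is bivariate normal, that only y is left censored at a single known detection limit, and that a detected pair contributes the joint density while a non-detect contributes f(x) multiplied by P(y < limit | x). The function names, the parameterization (log standard deviations and Fisher's z for the correlation), the crude coordinate search, and the toy data are all illustrative assumptions.

```python
import math

def norm_logpdf(z):
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def norm_logcdf(z):
    # erfc keeps precision in the lower tail; the floor avoids log(0).
    return math.log(max(0.5 * math.erfc(-z / math.sqrt(2.0)), 1e-300))

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def loglik(theta, x, y, detected, limit):
    """theta = (mu_x, mu_y, log sd_x, log sd_y, atanh rho); assumed parameterization."""
    mu_x, mu_y, log_sx, log_sy, z = theta
    sx, sy = math.exp(log_sx), math.exp(log_sy)
    rho = max(-0.999999, min(0.999999, math.tanh(z)))
    cond_sd = sy * math.sqrt(1.0 - rho * rho)
    total = 0.0
    for xi, yi, seen in zip(x, y, detected):
        zx = (xi - mu_x) / sx
        total += norm_logpdf(zx) - log_sx            # marginal density of x
        cond_mean = mu_y + rho * sy * zx             # E[y | x]
        if seen:                                     # detected: density of y given x
            total += norm_logpdf((yi - cond_mean) / cond_sd) - math.log(cond_sd)
        else:                                        # non-detect: P(y < limit | x)
            total += norm_logcdf((limit - cond_mean) / cond_sd)
    return total

def start_values(x, y, detected):
    # Moment-based start using all x and the detected y values only.
    xd = [a for a, s in zip(x, detected) if s]
    yd = [b for b, s in zip(y, detected) if s]
    n = len(x)
    mx = sum(x) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    my = sum(yd) / len(yd)
    sy = math.sqrt(sum((b - my) ** 2 for b in yd) / len(yd))
    return [mx, my, math.log(sx), math.log(sy), math.atanh(pearson(xd, yd))]

def fit_rho(x, y, detected, limit, step=0.5, tol=1e-7, max_iter=5000):
    """Crude coordinate search; a real analysis would use a proper optimizer."""
    theta = start_values(x, y, detected)
    best = loglik(theta, x, y, detected, limit)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(theta)):
            for delta in (step, -step):
                trial = list(theta)
                trial[i] += delta
                val = loglik(trial, x, y, detected, limit)
                if val > best:
                    theta, best, improved = trial, val, True
        if not improved:
            step /= 2.0
    return math.tanh(theta[4]), best

# Hypothetical toy data: x is fully observed, y has two non-detects (None).
x = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7, 1.5, 4.4, 2.6, 3.1]
y = [4.8, 4.4, 4.6, None, 4.9, 3.3, 5.4, None, 3.8, 4.1]
detected = [v is not None for v in y]
rho_hat, ll = fit_rho(x, y, detected, limit=3.2)
print("ML estimate of rho:", round(rho_hat, 3))
```

When every pair is detected, the censored term never enters and the likelihood is the ordinary bivariate normal one, whose maximizer for rho is Pearson's r; this matches the reduction property stated in the abstract. Confidence intervals and the multiple imputation variant are not covered by this sketch.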
