Evaluation of regression methods when immunological measurements are constrained by detection limits

Background: The statistical analysis of immunological data may be complicated because precise quantitative levels cannot always be determined. Values below a given detection limit may not be observed (nondetects), and data with nondetects are called left-censored. Since nondetects cannot be considered missing at random, a statistician faced with such data must decide how to combine nondetects with detects. Until now, the common practice has been to impute each nondetect with a single value, such as half the detection limit, and then to conduct an ordinary regression analysis. The first aim of this paper is to give an overview of methods for analyzing censored data, and to provide new ones, that go beyond ordinary linear regression. The second aim is to compare these methods in simulation studies based on real data.

Results: We compared six new and existing methods: deletion of nondetects, single substitution, extrapolation by regression on order statistics, multiple imputation using maximum likelihood estimation, tobit regression, and logistic regression. The deletion method and the extrapolation by regression on order statistics method gave biased parameter estimates. The single substitution method underestimated variances, and logistic regression suffered a loss of power. In the simulation studies, tobit regression performed well when the proportion of nondetects was less than 30%, and overall the multiple imputation method performed best.

Conclusion: Based on simulation studies, the newly developed multiple imputation method performed consistently well across scenarios with varying proportions of nondetects and sample sizes, and even in the presence of heteroscedastic errors.
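To make the contrast between single substitution and tobit regression concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a simple linear model with normally distributed errors and a single detection limit, and it marks nondetects as `None`. The function names, the toy data, and the parameter values are all hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def substitute_half_dl(y, detection_limit):
    """Single substitution: replace each nondetect (None) with DL/2."""
    return [detection_limit / 2 if v is None else v for v in y]


def tobit_loglik(intercept, slope, sigma, x, y, detection_limit):
    """Log-likelihood of a left-censored (tobit) linear model.

    Detects contribute the normal density of their residual.
    Nondetects (None) contribute P(Y < detection_limit | x),
    so no single value is ever imputed for them.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    total = 0.0
    for xi, yi in zip(x, y):
        mu = intercept + slope * xi
        if yi is None:
            # Censored observation: probability mass below the limit.
            total += math.log(std_normal.cdf((detection_limit - mu) / sigma))
        else:
            # Observed value: density of N(mu, sigma^2) at yi.
            total += math.log(std_normal.pdf((yi - mu) / sigma) / sigma)
    return total


# Hypothetical toy data: the first measurement fell below the limit.
x = [0.0, 1.0, 2.0, 3.0]
y = [None, 1.2, 2.1, 2.9]
dl = 0.5

print(substitute_half_dl(y, dl))
print(tobit_loglik(0.0, 1.0, 1.0, x, y, dl))
```

Maximizing `tobit_loglik` over the intercept, slope, and sigma with a numerical optimizer would give tobit estimates. Single substitution instead feeds the filled-in values to ordinary regression as if they had been measured, which is consistent with the variance underestimation reported in the abstract.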
