Evaluating Left-Censored Data Through Substitution, Parametric, Semi-parametric, and Nonparametric Methods: A Simulation Study

Abstract: This study examines the degree of bias that different methods for handling left-censored data produce at particular sample sizes. The aim was to determine deviations from the median, the mean, and the standard deviation (SD) at different sample sizes and censoring rates for log-normal, exponential, and Weibull distributions, comparing full and censored data samples. The concept of censoring and the types of censoring are discussed first. Substitution, parametric (maximum likelihood estimation, MLE), nonparametric (Kaplan-Meier, KM), and semi-parametric (regression on order statistics, ROS) methods for evaluating left-censored observations are then introduced. For the simulation, uncensored data were generated using different parameter values for each distribution, and the datasets were then left-censored at rates of 5, 25, 45, and 65 %. The censored observations were estimated through substitution (LOD and LOD/$\sqrt{2}$, where LOD is the limit of detection), parametric (MLE), semi-parametric (ROS), and nonparametric (KM) methods. The sample size was increased from 20 to 300 in steps of 10. Performance was compared between the uncensored and the censored datasets on the basis of deviations from the median, the mean, and the SD. The simulation results show that the LOD/$\sqrt{2}$ and ROS methods give smaller deviations from the mean than the other methods across sample sizes and censoring rates, while ROS gives smaller deviations from the median than the other methods at almost all sample sizes and censoring rates.
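As an illustration of the simulation design described in the abstract, the following minimal Python sketch generates an uncensored log-normal sample, left-censors it at a chosen rate, applies the two substitution methods (LOD and LOD/$\sqrt{2}$), and reports deviations from the mean, median, and SD of the uncensored sample. It is a sketch under stated assumptions, not the study's implementation: the function names (`left_censor`, `substitute`, `deviations`, `simulate`), the log-normal parameters (0, 1), the seed, and the rule of taking the LOD as the smallest uncensored value are all hypothetical choices made here. The MLE, ROS, and KM estimators are not shown.

```python
import math
import random
import statistics


def left_censor(values, rate):
    """Pick an LOD so that roughly `rate` of the sample falls below it.

    Assumption: the LOD is the smallest value that stays uncensored,
    so the k smallest observations are treated as non-detects.
    """
    k = int(round(rate * len(values)))
    lod = sorted(values)[k]
    n_censored = sum(1 for v in values if v < lod)
    return lod, n_censored


def substitute(values, lod, fill):
    """Replace every observation below the LOD with the constant `fill`."""
    return [fill if v < lod else v for v in values]


def deviations(full, estimated):
    """Deviation of each summary statistic from its uncensored value."""
    return {
        "mean": statistics.mean(estimated) - statistics.mean(full),
        "median": statistics.median(estimated) - statistics.median(full),
        "sd": statistics.stdev(estimated) - statistics.stdev(full),
    }


def simulate(n=100, rate=0.25, seed=0):
    """One replicate: log-normal(0, 1) sample, censored, then substituted."""
    rng = random.Random(seed)
    full = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    lod, n_censored = left_censor(full, rate)
    return {
        "n_censored": n_censored,
        "LOD": deviations(full, substitute(full, lod, lod)),
        "LOD/sqrt(2)": deviations(full, substitute(full, lod, lod / math.sqrt(2))),
    }


if __name__ == "__main__":
    # Sample sizes 20 to 300 by tens and the four censoring rates from the abstract.
    for n in range(20, 301, 10):
        for rate in (0.05, 0.25, 0.45, 0.65):
            result = simulate(n=n, rate=rate, seed=n)
            print(n, rate, round(result["LOD/sqrt(2)"]["mean"], 4))
```

A single replicate per setting is shown for brevity; a simulation study would average the deviations over many replicates. Note that substitution leaves the median unchanged whenever the censoring rate is below 50 %, because every substituted value stays below the LOD, so only the mean and SD respond to the choice of fill value in that range.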
