Estimation for small normal data sets with detection limits.

When environmental phenomena are measured, the measuring devices and procedures used cannot detect low concentrations, so concentrations below certain threshold levels are not measurable. Standard “detection limits” are set by various agencies for various phenomena and various types of measuring devices. Measured values below these limits are reported as “below detection limit” or as “trace” and are thus not available for statistical analysis. (Sometimes values below these limits are available, but their accuracy is greatly in doubt.) Consequently, the statistician often faces a very basic problem: how to analyze data sets that contain a substantial percentage of “below detection limit” entries.

Environmental data are characterized not only by detection limits but also by small sample sizes. Measurements required for compliance purposes are often performed annually, quarterly, or, at most, monthly because of the expense or disruption caused by the testing. Studies of pilot plants or demonstration plants are often so short that only 5-10 samples are obtained. Thus, methods for estimating the parameters of environmental data that rely on asymptotic or large-sample procedures are usually inapplicable.

In summary, environmental data usually have the following characteristics, which make them difficult to analyze: (1) The data are cut off from below by detection limits. (2) The sample size is very small.

As an example, suppose we have taken eight samples of air near a chemical warehouse to see whether there are leaks (fugitive emissions). Concentrations below 0.8 ppb, say, are below the reliability of the measurement procedure. Of the eight samples, suppose five are below the detection limit while the other three have measured concentrations of 1, 2, and 5 ppb. How do we find the average concentration? Surely the smallest value that the true average could be is (0 + 0 + 0 + 0 + 0 + 1 + 2 + 5)/8 = 1 ppb
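
The arithmetic in this example can be checked with a short sketch. The lower bound replaces each nondetect with 0, as in the text. The upper bound replaces each nondetect with the detection limit. That bound is not stated in the passage above; it rests on the assumption that every unmeasured concentration lies between 0 and 0.8 ppb. The function name `mean_bounds` and its parameters are hypothetical and serve only as an illustration.

```python
def mean_bounds(detected, n_censored, detection_limit):
    """Bound the sample mean when some values are below a detection limit.

    Assumes every censored value lies in [0, detection_limit].
    """
    n = len(detected) + n_censored
    total = sum(detected)
    # Smallest possible mean: every nondetect is exactly 0.
    lower = total / n
    # Largest possible mean: every nondetect sits at the detection limit.
    upper = (total + n_censored * detection_limit) / n
    return lower, upper


# Eight air samples: three detects (ppb), five nondetects, limit 0.8 ppb.
lower, upper = mean_bounds([1.0, 2.0, 5.0], n_censored=5, detection_limit=0.8)
print(f"mean lies between {lower:.2f} and {upper:.2f} ppb")
```

Under that assumption the average is confined to the interval from 1 to 1.5 ppb. Substituting fixed values only brackets the mean; it is not an estimator of it.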