Effect of outliers and nonhealthy individuals on reference interval estimation.

BACKGROUND This study examines whether a new outlier detection technique improves reference interval estimation, even in a physician-determined healthy sample, and evaluates the effect of including physician-determined nonhealthy individuals in the sample. METHODS Traditional data transformation coupled with robust and exploratory outlier detection methodology was used in conjunction with various reference interval determination techniques. A simulation study was used to examine the effects of outliers on known reference intervals. On real data, physician-defined healthy groups were compared with and without the addition of nonhealthy individuals. RESULTS With 5% outliers in simulated samples, the described outlier detection techniques yielded narrower reference intervals. Application of the technique to real data provided reference intervals that were, on average, 10% narrower than those obtained when outlier detection was not used. Only 1.6% of the samples were identified as outliers and removed from reference interval determination in both the healthy and combined samples. CONCLUSIONS Outliers may exist even in healthy samples. Combining traditional and robust statistical techniques provides a good method of identifying outliers in a reference interval setting. Laboratories in general do not have a well-defined healthy group from which to compute reference intervals. Including nonhealthy individuals in the computation increases reference interval width by approximately 10%, although the effect varies considerably among analytes.
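
The general workflow described under METHODS (transform, detect outliers, then estimate the interval) can be illustrated with a minimal sketch. The abstract does not name the specific transformation, outlier rule, or interval estimator, so everything below is an assumption chosen for illustration: a log transformation, Tukey-style fences at 1.5 times the interquartile range, and a nonparametric central 95% interval. The function names and the simulated data (950 "healthy" values plus 50 contaminating values, i.e. 5% outliers) are hypothetical and do not reproduce the study's simulation design.

```python
import math
import random
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in [0, 1]) of a sorted list."""
    pos = p * (len(sorted_vals) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def reference_interval(values, remove_outliers=True, k=1.5):
    """Central 95% interval, optionally after fence-based outlier removal.

    Assumes strictly positive values; the log transform is an illustrative
    stand-in for whatever transformation suits the analyte.
    Returns (lower_limit, upper_limit, number_of_values_removed).
    """
    logs = [math.log(v) for v in values]
    if remove_outliers:
        lower, upper = tukey_fences(logs, k)
        logs = [x for x in logs if lower <= x <= upper]
    logs.sort()
    lo_limit = math.exp(percentile(logs, 0.025))
    hi_limit = math.exp(percentile(logs, 0.975))
    return lo_limit, hi_limit, len(values) - len(logs)


# Hypothetical simulated sample: 95% "healthy" values, 5% shifted outliers.
rng = random.Random(0)
sample = [rng.lognormvariate(0.0, 0.25) for _ in range(950)]
sample += [rng.lognormvariate(1.5, 0.25) for _ in range(50)]

with_removal = reference_interval(sample, remove_outliers=True)
without_removal = reference_interval(sample, remove_outliers=False)

print("with outlier removal:   ", with_removal)
print("without outlier removal:", without_removal)
```

In this sketch the interval computed after outlier removal is narrower than the one computed on the full sample, because the contaminating values no longer pull the upper limit outward. The size of that difference depends entirely on the assumed contamination and should not be compared with the percentages reported in the abstract.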
