Regression-based reference limits: determination of sufficient sample size.

Regression analysis is the method of choice for producing covariate-dependent reference limits. There are currently no recommendations on the sample size needed when regression-based reference limits and their confidence intervals are calculated. In this study we used Monte Carlo simulation to study a reference sample group of 374 age-dependent hemoglobin values. From this sample, 5000 random subsamples of 10 to 220 observations each were drawn with replacement. Regression analysis was used to estimate age-dependent 95% reference intervals for hemoglobin concentrations and erythrocyte counts. The maximum difference between the mean root mean square error of the subsamples and the value from the original sample was 0.05 g/L for hemoglobin when the sample size was ≥ 60. The means of the parameter estimators and of the reference interval width differed negligibly from the values calculated from the original sample, regardless of sample size. The SDs and CVs of these quantities changed rapidly up to a sample size of 30; beyond that, the changes were smaller. The largest and smallest absolute differences in root mean square error and in reference interval width between the subsample values and the values calculated from the original sample were also evaluated. As expected, the differences were largest for small samples and decreased as sample size increased. To obtain appropriate reference limits and confidence intervals, we propose the following scheme: (a) check whether the assumptions of regression analysis can be fulfilled, with or without transformation of the data; (b) check that the value of v, which describes how a covariate value is situated in relation to both the mean and the spread of the covariate values, does not exceed 0.1 at the minimum and maximum covariate values; and (c) if steps (a) and (b) are satisfied, the reference limits with confidence intervals can be produced by regression analysis, and the minimum acceptable sample size is approximately 70.
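The resampling design described above can be illustrated with a minimal Python sketch. It is not the study's own code, and several parts are assumptions made only for illustration: the data are synthetic (the 374 hemoglobin values are not reproduced here), the model is a straight line with constant variance, the 95% reference interval is taken as the fitted value ± 1.96 × root mean square error, and v is assumed to be the leverage-type quantity 1/n + (x0 − x̄)² / Σ(xi − x̄)², which the abstract does not define explicitly. The function names are hypothetical.

```python
import numpy as np

def fit_reference_limits(x, y, z=1.96):
    """Fit y = b0 + b1*x by least squares; return b0, b1, RMSE, interval width."""
    n = len(x)
    x_mean = x.mean()
    sxx = np.sum((x - x_mean) ** 2)
    b1 = np.sum((x - x_mean) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x_mean
    resid = y - (b0 + b1 * x)
    rmse = np.sqrt(np.sum(resid ** 2) / (n - 2))  # residual SD with n - 2 df
    width = 2.0 * z * rmse  # assumed: limits are fitted value +/- z * RMSE
    return b0, b1, rmse, width

def leverage(x, x0):
    """Assumed form of v: position of x0 relative to the covariate mean and spread."""
    return 1.0 / len(x) + (x0 - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)

def resample_study(x, y, sizes, n_rep=5000, seed=0):
    """Draw n_rep subsamples with replacement per size; summarize the RMSE."""
    rng = np.random.default_rng(seed)
    summary = {}
    for m in sizes:
        rmses = np.empty(n_rep)
        for r in range(n_rep):
            idx = rng.integers(0, len(x), size=m)  # indices drawn with replacement
            rmses[r] = fit_reference_limits(x[idx], y[idx])[2]
        summary[m] = (rmses.mean(), rmses.std(ddof=1))
    return summary

# Synthetic stand-in for the reference sample (NOT the study's data).
rng = np.random.default_rng(42)
age = rng.uniform(0.0, 12.0, size=374)                    # hypothetical age in months
hb = 110.0 + 1.0 * age + rng.normal(0.0, 8.0, size=374)   # hypothetical hemoglobin, g/L

full_rmse = fit_reference_limits(age, hb)[2]
summary = resample_study(age, hb, sizes=[10, 30, 70, 220], n_rep=500)
for m, (mean_rmse, sd_rmse) in summary.items():
    print(m, round(mean_rmse - full_rmse, 2), round(sd_rmse, 2))

# Step (b) of the proposed scheme, under the assumed definition of v.
print(leverage(age, age.min()), leverage(age, age.max()))
```

The demonstration uses 500 replicates per size to keep the run short; setting `n_rep=5000` matches the number of subsamples stated above. Under these assumptions the spread of the root mean square error across subsamples shrinks as the sample size grows, which is the pattern the abstract reports.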
