Validity and reliability of exposure assessors' ratings of exposure intensity by type of occupational questionnaire and type of rater.

BACKGROUND In epidemiologic studies that rely on professional judgment to assess occupational exposures, accurate assessment by the raters is vital for detecting associations. We examined the influence of the type of questionnaire, type of industry, and type of rater on the raters' ability to reliably and validly assess within-industry differences in exposure. Our aim was to identify areas where improvements in exposure assessment may be possible.

METHODS Subjects from three foundries (n = 72) and three textile plants (n = 74) in Shanghai, China, completed an occupational history (OH) and an industry-specific questionnaire (IQ). Six total-dust measurements were collected per subject and used to calculate a subject-specific measurement mean, which served as the gold standard. Six raters independently rated the exposure intensity of each subject's current job on an ordinal scale (1-4), first based on the OH alone and then on the OH and IQ together. Aggregate ratings were calculated for the group, for industrial hygienists, and for occupational physicians. We calculated intra-class correlation coefficients (ICCs) to evaluate the reliability of the raters, and the correlation between the subject-specific measurement means and the ratings to evaluate the raters' validity. Analyses were stratified by industry, type of questionnaire, and type of rater. We also examined the agreement between the ratings and the categorized measurement means, with the subject-specific measurement means grouped into two and into four exposure categories.

RESULTS The reliability and validity measures were higher for the aggregate ratings than for the ratings from the individual raters. The group's performance was maximized with three raters. Both the reliability and validity measures were higher for the foundry industry than for the textile industry. The ICCs were consistently lower in the OH/IQ round than in the OH round in both industries. In contrast, the correlations with the measurement means were higher in the OH/IQ round than in the OH round for the foundry industry (group rating, OH/IQ: Spearman rho = 0.77; OH: rho = 0.64). No pattern by questionnaire type was observed for the textile industry (group rating, Spearman rho = 0.50 in both assessment rounds). For both industries, the agreement by exposure category was higher when the task was reduced to discriminating between two rather than four exposure categories.

CONCLUSIONS Assessments based on professional judgment may reduce misclassification by using two or three raters, by using questionnaires that systematically collect task information, and by defining intensity categories that the raters can distinguish. However, few studies have the resources to use multiple raters, and these additional efforts may not be adequate for obtaining valid subjective ratings. Thus, improving exposure assessment approaches for studies that rely on professional judgment remains an important research need.
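
The reliability and validity measures named in the methods can be illustrated with a minimal Python sketch. Several choices here are assumptions rather than details given in the abstract: the ICC form (the two-way random-effects, single-rater ICC(2,1) of Shrout and Fleiss), the aggregate rating taken as the mean across raters, and the tie-averaged ranking used for Spearman's rho. All function names and the demo numbers are hypothetical and are not study data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def icc_2_1(ratings):
    """ICC(2,1): ratings[i][j] is rater j's score for subject i (assumed form)."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    subject_means = [mean(row) for row in ratings]
    rater_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_subjects - ss_raters
    bms = ss_subjects / (n - 1)             # between-subjects mean square
    jms = ss_raters / (k - 1)               # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


def aggregate_rating(ratings):
    """Group rating per subject, assumed here to be the mean across raters."""
    return [mean(row) for row in ratings]


if __name__ == "__main__":
    # Hypothetical ratings (1-4 scale) from three raters for five subjects,
    # and hypothetical subject-specific measurement means.
    demo_ratings = [[1, 2, 1], [2, 2, 3], [3, 3, 2], [4, 3, 4], [2, 1, 2]]
    demo_means = [0.4, 1.1, 1.9, 3.5, 0.8]
    print("ICC(2,1):", icc_2_1(demo_ratings))
    print("rho, group rating:", spearman_rho(aggregate_rating(demo_ratings), demo_means))
```

Correlating the aggregate rating, rather than each rater's own scores, with the measurement means mirrors the comparison of group and individual performance described in the results. A different ICC form or aggregation rule would change the numbers but not the structure of the calculation.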
