O4E.2 Efficiency of autocoding programs for converting job descriptors into standard occupational classification codes

Introduction Standard Occupational Classification (SOC) codes can link work exposure data to individual health outcomes, but assigning job codes manually is laborious. We tested two recently developed automatic coding programs.

Methods We entered self-reported job titles and industries from two existing cohorts into two publicly available autocoding programs, the NIOSH Industry and Occupation Computerized Coding System (NIOCCS) and the Standardized Occupation Coding for Computer-assisted Epidemiological Research (SOCcer), and assessed agreement between the autocodes and manual codes. We also assessed agreement between several exposure values (from the Occupational Information Network, O*NET) linked by manual SOC codes and the same values linked by autocodes, to examine how coding differences might affect exposure assignments in general population cohort studies.

Results NIOCCS produced SOC codes for the majority of subjects (Cohort 1: 85%; Cohort 2: 79%). The level of detail of these codes varied slightly; 6-digit SOC codes (detailed occupations) were available for 84% and 76% of Cohorts 1 and 2, respectively. Comparison with manual codes showed strong agreement at the major group (2-digit) level (kappa=0.8 for both cohorts) and weaker agreement at the 6-digit level (kappa ≥0.4 and 0.6). SOCcer produced 6-digit SOC codes for all subjects, with good agreement at the 2-digit level (kappa ≥0.6 and 0.7) and slightly lower agreement at the 6-digit level (kappa ≥0.3 and 0.4). Agreement for O*NET exposures was very high for most comparisons in both cohorts and for both programs (many ICCs>0.8).

Conclusion Both autocoding programs can be reliable tools to aid in assigning SOC codes that represent broad industry levels, with less agreement at finer levels of job codes. Given the availability of large public datasets that contain job information but no other work exposure data, autocoding of jobs provides exciting opportunities for analyzing work-related health outcomes in future studies.
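The agreement comparison described in the Methods can be illustrated with a minimal Python sketch of Cohen's kappa computed at two levels of SOC detail. This is not the authors' analysis code: the function names (`cohen_kappa`, `major_group`) and the six example code pairs are hypothetical and invented purely for illustration, and the sketch assumes unweighted kappa on SOC codes written in the usual `NN-NNNN` form.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def major_group(soc_code):
    """Return the 2-digit major group of a SOC code such as '29-1141'."""
    return soc_code[:2]


# Hypothetical toy data (not from the study): manual codes vs. autocodes.
manual = ["29-1141", "29-1141", "47-2031", "47-2061", "41-2031", "41-2011"]
auto = ["29-1141", "29-1171", "47-2031", "47-2031", "41-2031", "43-3071"]

# Agreement on the full 6-digit detailed occupation codes.
kappa_detailed = cohen_kappa(manual, auto)

# Agreement after collapsing both code lists to 2-digit major groups.
kappa_major = cohen_kappa(
    [major_group(c) for c in manual],
    [major_group(c) for c in auto],
)

print(f"6-digit kappa: {kappa_detailed:.2f}")
print(f"2-digit kappa: {kappa_major:.2f}")
```

In this toy example, several disagreements at the 6-digit level fall within the same major group, so collapsing to 2 digits raises kappa. That is the same qualitative pattern the Results report, though the numbers here carry no empirical meaning.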