Classifying Open-Ended Reports: Factors Affecting the Reliability of Occupation Codes

Abstract: The misclassification of open-ended responses is a source of survey processing error that has received insufficient study to date. We report on efforts to understand the misclassification of open-ended occupation descriptions in the Current Population Survey (CPS). We analyzed double-coded CPS descriptions to identify which features vary with intercoder reliability. One factor strongly related to reliability was the length of the occupation description: longer descriptions were less reliably coded than shorter ones. This effect was stronger for particular occupation terms. We then carried out an experiment to examine the joint effects of description length and the classification “difficulty” of particular occupation terms. For easy occupation terms, longer descriptions were less reliably coded, but for difficult occupation terms, longer descriptions were slightly more reliably coded than short descriptions. Finally, we observed coders as they provided verbal reports on their decision making. One practice evident in these verbal reports is the use of informal coding rules based on superficial features of the description. Such rules are likely to promote the reliability, though not necessarily the validity, of coding. To the extent that coders use informal rules for long descriptions involving difficult terms, this practice could help explain the observed relationship between description length and the difficulty of coding particular terms.
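The comparison of intercoder reliability across description lengths can be illustrated with a minimal sketch. It is not the authors' analysis: the function name, the three-word cutoff between "short" and "long" descriptions, the use of simple percent agreement as the reliability measure, and the sample records with their placeholder codes are all assumptions made for illustration.

```python
from collections import defaultdict


def agreement_by_length(records, short_max_words=3):
    """Return the share of double-coded descriptions on which both coders agree,
    split into "short" and "long" descriptions.

    records: iterable of (description, code_a, code_b) tuples.
    short_max_words: hypothetical cutoff; descriptions with at most this many
    words count as "short".
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [agreements, total]
    for description, code_a, code_b in records:
        n_words = len(description.split())
        bucket = "short" if n_words <= short_max_words else "long"
        counts[bucket][0] += int(code_a == code_b)
        counts[bucket][1] += 1
    return {bucket: agree / total for bucket, (agree, total) in counts.items()}


# Made-up records with placeholder codes, for illustration only.
sample = [
    ("registered nurse", "A1", "A1"),
    ("truck driver", "B2", "B2"),
    ("manager of a small retail clothing store", "C3", "D4"),
    ("helps teachers with classroom activities and paperwork", "E5", "E5"),
]

rates = agreement_by_length(sample)
print(rates)
```

Percent agreement is the simplest choice here; a chance-corrected statistic could be substituted without changing the grouping logic.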
