Reliability in Coding Open-Ended Data: Lessons Learned from HIV Behavioral Research

Analysis of text from open-ended interviews has become an important research tool in numerous fields, including business, education, and health research. Coding is an essential part of such analysis, but questions of quality control in the coding process have generally received little attention. This article examines the text coding process applied to three HIV-related studies conducted with the Centers for Disease Control and Prevention, covering populations in the United States and Zimbabwe. Based on experience coding data from these studies, we conclude that (1) a team of coders will initially produce very different codings, but (2) it is possible, through a process of codebook revision and recoding, to establish strong levels of intercoder reliability (e.g., most codes with kappa ≥ 0.8). Furthermore, steps can be taken to improve initially poor intercoder reliability and to reduce the number of iterations required to reach strong intercoder reliability.
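As an illustration of the kappa statistic mentioned above, the following is a minimal sketch of Cohen's kappa for two coders who each decide whether a single code applies to each text segment. It is not the procedure or software used in the studies; the function name `cohen_kappa` and the sample codings are hypothetical, and the 0.8 cutoff is simply the example threshold quoted in the abstract.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' labels on the same segments.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e is the agreement expected by
    chance from each coder's marginal label frequencies.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must label the same non-empty set of segments")

    n = len(coder_a)

    # Observed agreement: share of segments given the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: product of marginal proportions, summed over labels.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)

    # If both coders used one identical label throughout, kappa is
    # undefined (0/0); returning 1.0 here is a convention of this sketch.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical codings: 1 = code applied to the segment, 0 = not applied.
coder_a = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
coder_b = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

kappa = cohen_kappa(coder_a, coder_b)
needs_revision = kappa < 0.8  # example threshold from the abstract
print(f"kappa = {kappa:.2f}, revise code definition: {needs_revision}")
```

In this made-up example the coders agree on 8 of 10 segments, yet kappa is well below 0.8 because much of that agreement would be expected by chance. Under the iterative process the abstract describes, a code with a value like this would be a candidate for codebook revision and recoding.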
