The a(d) coefficient as a descriptive measure of the within-group agreement of ratings.

The a(d) coefficient was developed to measure within-group agreement of ratings. Both the underlying theory and the construction of the coefficient are explained. The a(d) coefficient ranges from 0 to 1 regardless of the number of scale points, raters, or items, so that, with some limitations, within-group agreement can be compared directly across groups and across studies. For statistical significance testing, the binomial distribution is introduced as a model of the random distribution of ratings given the true score of a group construct. This approach permits a decision about essential agreement rather than merely about a significant difference from 0 or from a chosen critical value. The a(d) coefficient identifies a single true score within a group; it is not intended for settings with multiple true scores. A comparison of the a(d) coefficient with other agreement indices shows that the new coefficient is consistent with their outcomes but does not yield infinite or otherwise inappropriate values.
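The general logic of a binomial null model for agreement testing can be sketched as follows. This is a minimal illustration only: it uses the well-known average-deviation (AD) statistic as a stand-in agreement measure (the a(d) coefficient itself is defined in the paper, not here), and the Monte Carlo test, the 1-to-s rating scale, and the mapping of the true score to a binomial success probability are all simplifying assumptions made for this sketch.

```python
import random
from statistics import mean

def average_deviation(ratings):
    # Average absolute deviation of ratings from the group mean.
    # Illustrative stand-in for an agreement statistic; smaller = more agreement.
    m = mean(ratings)
    return mean(abs(r - m) for r in ratings)

def binomial_null_pvalue(ratings, scale_points, true_score, n_sim=10000, seed=0):
    # Monte Carlo version of a binomial-model significance test (assumed setup):
    # under the null, each rating on a 1..scale_points scale is 1 + X with
    # X ~ Binomial(scale_points - 1, p), where p is chosen so that the
    # expected rating equals the assumed true score. The p-value is the
    # fraction of simulated groups that agree at least as well as the
    # observed group.
    rng = random.Random(seed)
    k = scale_points - 1
    p = (true_score - 1) / k          # E[1 + Binomial(k, p)] == true_score
    observed = average_deviation(ratings)
    hits = 0
    for _ in range(n_sim):
        sim = [1 + sum(rng.random() < p for _ in range(k)) for _ in ratings]
        if average_deviation(sim) <= observed:
            hits += 1
    return hits / n_sim

# Example: five raters on a 5-point scale, assumed true score of 3.
p_val = binomial_null_pvalue([3, 3, 3, 2, 3], scale_points=5, true_score=3)
```

A small p-value here indicates that the observed ratings cluster more tightly than binomial scatter around the true score would produce by chance, which mirrors the abstract's point that the test addresses essential agreement rather than a mere nonzero difference.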
