Estimating within-group interrater reliability with and without response bias.

Abstract: This article presents methods for assessing agreement among the judgments made by a single group of judges on a single variable in regard to a single target. For example, the group of judges could be editorial consultants, members of an assessment center, or members of a team. The single target could be a manuscript, a lower-level manager, or a team. The variable on which the target is judged could be overall publishability for the manuscript, managerial potential for the lower-level manager, or team cooperativeness for the team. The methods presented are based on new procedures for estimating interrater reliability. For situations such as these, the procedures are shown to furnish more accurate and interpretable estimates of agreement than procedures commonly used to estimate agreement, consistency, or interrater reliability. In addition, the proposed methods include processes for controlling the spurious influences of response biases (e.g., positive leniency, social desirability) on estimates of interrater reliability.
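The abstract gives no formulas, so the following is only a minimal sketch of one common way to build a within-group agreement index for a single item: compare the observed variance of the judges' ratings with the variance expected if judges responded at random. A uniform null distribution corresponds to the "without response bias" case, and a skewed null distribution stands in for a response bias such as positive leniency. The function names and the skewed probabilities below are hypothetical choices for illustration and are not taken from the abstract.

```python
from statistics import variance


def uniform_null_variance(num_options: int) -> float:
    """Variance of a discrete uniform distribution over scale points 1..num_options."""
    return (num_options ** 2 - 1) / 12.0


def null_variance_from_probs(probs) -> float:
    """Variance of a null distribution, given the probability of each scale point 1..A."""
    points = range(1, len(probs) + 1)
    mean = sum(p * x for p, x in zip(probs, points))
    return sum(p * (x - mean) ** 2 for p, x in zip(probs, points))


def agreement_index(ratings, null_variance: float) -> float:
    """One minus the ratio of observed rating variance to the null (random-response) variance."""
    observed = variance(ratings)  # sample variance with n - 1 in the denominator
    return 1.0 - observed / null_variance


# Five judges rate one target on a 5-point scale (made-up ratings).
ratings = [4, 4, 5, 4, 4]

# No response bias assumed: every scale point is equally likely under random responding.
no_bias = agreement_index(ratings, uniform_null_variance(5))

# Assumed leniency bias: random responding piles up on the high end of the scale.
# These probabilities are illustrative assumptions only.
lenient_probs = [0.05, 0.15, 0.20, 0.35, 0.25]
with_bias = agreement_index(ratings, null_variance_from_probs(lenient_probs))

print(round(no_bias, 2), round(with_bias, 2))
```

In this sketch the skewed null distribution has a smaller variance than the uniform one, so the same ratings yield a lower agreement estimate once the assumed leniency bias is taken into account.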
