Artifact, bias, and complexity of assessment: the ABCs of reliability.

Interobserver agreement (also referred to here as "reliability") is influenced by diverse sources of artifact, bias, and complexity of the assessment procedures. The literature on reliability assessment has frequently focused on the different methods of computing reliability and the circumstances under which each method is appropriate. Yet the credence accorded estimates of interobserver agreement, computed by any method, presupposes that sources of bias that can spuriously affect agreement have been eliminated. The present paper reviews evidence pertaining to various sources of artifact and bias, as well as characteristics of assessment that influence the interpretation of interobserver agreement. These include reactivity of reliability assessment, observer drift, complexity of response codes and behavioral observations, observer expectancies and feedback, and others. Recommendations are provided for eliminating or minimizing the influence of these factors on interobserver agreement.
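Two of the agreement indices commonly discussed in this literature are simple percent agreement and a chance-corrected coefficient such as Cohen's kappa. The following sketch (not from the paper; function names and data are illustrative) computes both for interval-recorded occurrence/nonoccurrence data from two observers:

```python
def percent_agreement(obs1, obs2):
    """Proportion of intervals on which the two observers agree."""
    assert len(obs1) == len(obs2)
    matches = sum(a == b for a, b in zip(obs1, obs2))
    return matches / len(obs1)

def cohens_kappa(obs1, obs2):
    """Chance-corrected agreement for binary occurrence data."""
    n = len(obs1)
    po = percent_agreement(obs1, obs2)
    # Expected chance agreement from each observer's marginal rates.
    p1 = sum(obs1) / n
    p2 = sum(obs2) / n
    pe = p1 * p2 + (1 - p1) * (1 - p2)
    return (po - pe) / (1 - pe)

# Two observers scoring occurrence (1) / nonoccurrence (0) over 10 intervals.
o1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
o2 = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]
print(percent_agreement(o1, o2))  # 0.8
print(cohens_kappa(o1, o2))       # ~0.583
```

Note how the same data yield a lower kappa than percent agreement: when both observers score the behavior at similar base rates, a portion of their raw agreement is expected by chance alone, which is one reason the choice of index matters when interpreting reliability estimates.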
