Observer bias in animal behaviour research: can we believe what we score, if we score what we believe?

Most observers in behaviour studies are aware of relevant information about the animals being observed. During a class practicum, we investigated whether observer expectations influence subjective scoring methods. Veterinary students were trained in recording negative and positive interactions between pigs, in scoring the degree of panting in cattle and in applying qualitative behaviour assessment (QBA) using a fixed set of terms for assessing hens' behaviour. The students applied these methods in three trials in which they were shown duplicated video recordings of the same animals: the original and a slightly modified version (to prevent recognition at second viewing). When scoring the duplicated recordings they were given either correct or false information about the conditions in which the animals had been filmed. The false information reflected plausible study scenarios in ethology and was used to create expectations about the outcome. Because the students in fact scored identical behaviour twice, the difference between the scores for the original and modified recordings reflects expectation bias caused by the differing contextual information. In all trials there was evidence of expectation bias: students scored the ratio of positive to negative interactions higher when told that the observed pigs had been selected for high social breeding value, they scored cattle panting higher when told that the ambient temperature was 5 °C higher than in reality, and in the QBA they indicated more positive and fewer negative emotions when told that the hens were from an organic instead of a conventional farm. The magnitude of the bias in the QBA trial was related to the students' opinions about hen welfare in organic versus conventional farms. Although veterinary students may not be representative of practising ethologists, these findings do indicate that observer bias could influence subjective scores of animal behaviour and welfare.
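The paired logic of the design can be illustrated with a minimal sketch: since each student scores the same behaviour twice, the within-student difference between the two scores estimates the expectation bias. The function name `paired_bias` and the example scores below are hypothetical and are not taken from the study; the abstract does not state which statistical analysis was used, so this shows only the simplest paired summary (mean difference and its standard error).

```python
from math import sqrt
from statistics import mean, stdev


def paired_bias(scores_correct_info, scores_false_info):
    """Return (mean difference, standard error) of paired scores.

    Each position holds one observer's two scores for the same footage:
    once with correct context, once with false context.
    """
    if len(scores_correct_info) != len(scores_false_info):
        raise ValueError("each observer needs exactly one score per condition")
    # Positive differences mean higher scores under the false information.
    diffs = [f - c for c, f in zip(scores_correct_info, scores_false_info)]
    return mean(diffs), stdev(diffs) / sqrt(len(diffs))


# Hypothetical panting scores from five observers (illustrative only).
correct_info = [1.0, 2.0, 1.5, 2.5, 2.0]
false_info = [1.5, 2.0, 2.0, 3.0, 2.5]

mean_diff, std_err = paired_bias(correct_info, false_info)
print(f"mean difference = {mean_diff:.2f}, standard error = {std_err:.2f}")
```

Because the behaviour shown is identical in both viewings, a mean difference that is clearly distinct from zero cannot be attributed to the animals and points to the contextual information instead.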
