The evaluator effect in usability tests

Usability tests are applied in industry to evaluate systems and in research as a yardstick for other usability evaluation methods. However, one potential threat to the reliability of usability tests has been left unaddressed: the evaluator effect. In this study, four evaluators analyzed four videotaped usability test sessions. Only 20% of the 93 unique problems were detected by all four evaluators, and 46% were detected by only a single evaluator. Severe problems were detected more often by all four evaluators (41%) and less often by only one evaluator (22%), but a substantial evaluator effect remained.
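
The percentages above are shares of the unique problem set, grouped by how many evaluators reported each problem. A minimal sketch of that tabulation follows. It assumes each problem is recorded as the set of evaluators who detected it. The function name `detection_distribution`, the evaluator labels, and the toy data are hypothetical and are not the study's data.

```python
from collections import Counter


def detection_distribution(detections, n_evaluators):
    """Return the share of unique problems detected by exactly k evaluators.

    detections: dict mapping a problem id to the set of evaluators who
    reported it (hypothetical layout, assumed for this sketch).
    """
    # Count how many problems were found by 1, 2, ... evaluators.
    counts = Counter(len(found_by) for found_by in detections.values())
    total = len(detections)
    return {k: counts.get(k, 0) / total for k in range(1, n_evaluators + 1)}


# Toy data for illustration only; not taken from the study.
toy = {
    "P1": {"A", "B", "C", "D"},
    "P2": {"A"},
    "P3": {"B", "C"},
    "P4": {"D"},
    "P5": {"A", "B", "C"},
}

dist = detection_distribution(toy, n_evaluators=4)
print(f"detected by all four: {dist[4]:.0%}")
print(f"detected by only one: {dist[1]:.0%}")
```

With this layout, the share at `k = 4` corresponds to problems all evaluators agreed on, and the share at `k = 1` to problems reported by a single evaluator.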