The lack of standard assessment criteria for reliably comparing usability evaluation methods (UEMs) is an important gap in HCI knowledge. Recently, metrics for assessing the thoroughness, validity, and effectiveness of UEMs, grounded in user data, have been proposed to bridge this gap. This paper reports our findings from applying these proposed metrics in a study comparing heuristic evaluation (HE) with HE-Plus, an extended version of HE. Our experiment showed greater overlap among the HE-Plus evaluators than among the HE evaluators, indicating higher reliability of the method. When evaluation data from usability tests of the same website was used to calculate the UEM performance metrics, HE-Plus outperformed HE on all assessment criteria, with improvements of 17% in thoroughness, 39% in validity, and 67% in effectiveness. The paper concludes with a discussion of the limitations of the effectiveness of the UEM from which the real users' data was obtained.
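The abstract does not restate the proposed metric definitions. As a point of reference, a sketch of the thoroughness, validity, and effectiveness measures as commonly defined in the UEM literature (following Hartson, Andre, and Williges) is given below; this is the standard formulation, not one quoted from the present paper.

\[
\text{thoroughness} = \frac{\text{number of real problems found}}{\text{number of real problems that exist}},
\qquad
\text{validity} = \frac{\text{number of real problems found}}{\text{number of issues identified as problems}},
\]
\[
\text{effectiveness} = \text{thoroughness} \times \text{validity}.
\]

Under these definitions, a method can be thorough yet invalid (finding most real problems amid many false positives), which is why effectiveness, their product, is used as the combined criterion.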