Applying user testing data to UEM performance metrics

The lack of standard criteria for reliably comparing usability evaluation methods (UEMs) is an important gap in HCI knowledge. Metrics based on user data have recently been proposed to bridge this gap by assessing the thoroughness, validity, and effectiveness of UEMs. This paper reports our findings from applying these metrics in a study that compared heuristic evaluation (HE) with HE-Plus, an extended version of HE. In our experiment, the HE-Plus evaluators overlapped more in the problems they reported than the HE evaluators did, indicating greater reliability of the method. When data from user testing of the same website was used to calculate the UEM performance metrics, HE-Plus outperformed HE on all three criteria, with improvements of 17%, 39%, and 67% in thoroughness, validity, and effectiveness, respectively. The paper concludes with a discussion of limitations in the effectiveness of the UEM from which the real users' data was obtained.
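
For readers unfamiliar with the three metrics, the sketch below shows one way they might be computed. It assumes the definitions commonly used in the UEM literature, which this abstract does not itself state: thoroughness is the share of real problems that the method found, validity is the share of reported problems that are real, and effectiveness is their product. The set of "real" problems is taken to be the problems observed in user testing. The function name `uem_metrics` and all problem identifiers are hypothetical, and the numbers are illustrative only, not data from the study.

```python
def uem_metrics(reported, real):
    """Compute thoroughness, validity, and effectiveness for one UEM.

    reported: problem IDs that the UEM's evaluators reported
    real:     problem IDs confirmed by user testing (assumed ground truth)
    """
    reported = set(reported)
    real = set(real)
    hits = reported & real  # real problems that the UEM actually found

    # Guard against empty sets to avoid division by zero.
    thoroughness = len(hits) / len(real) if real else 0.0
    validity = len(hits) / len(reported) if reported else 0.0

    return {
        "thoroughness": thoroughness,
        "validity": validity,
        # Assumed definition: effectiveness = thoroughness * validity
        "effectiveness": thoroughness * validity,
    }


# Hypothetical data: five problems seen in user testing, four reported
# by the evaluators, two of which match real problems.
real_problems = {"P1", "P2", "P3", "P4", "P5"}
he_reported = {"P1", "P2", "X1", "X2"}

metrics = uem_metrics(he_reported, real_problems)
print(metrics)
```

Because effectiveness is a product under this definition, a method scores well only if it both finds many of the real problems and raises few false alarms.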