Using predictive human performance models to inspire and support UI design recommendations

Predictive human performance modeling has traditionally been used to make quantitative comparisons between alternative designs (e.g., task execution time for skilled users) instead of identifying UI problems or making design recommendations. This note investigates how reliably novice modelers can extract design recommendations from their models. Many HCI evaluation methods have been plagued by the "evaluator effect" [3], i.e., different people using the same method find different UI problems. Our data and analyses show that predictive human performance modeling is no exception. Novice modelers using CogTool [5] display a 34% Any-Two Agreement in their design recommendations, a result in the upper quartile of evaluator effect studies. However, because these recommendations are grounded in models, they may have more reliable impact on measurable performance than recommendations arising from less formal methods.