The problem of assessing problem solving: can comparative judgement help?