Oranges and Apples? Using Comparative Judgement for Reliable Briefing Paper Assessment in Simulation Games

Achieving a fair and rigorous assessment of participants in simulation games is a major challenge. The difficulty applies not only to the negotiation itself but also to the written assignments that typically accompany a simulation. In particular, when several raters are involved, it is important to ensure that differences in rater severity do not affect the grades. Recently, comparative judgement (CJ) has been introduced as a method that allows for team-based grading. This chapter discusses the potential of comparative judgement for assessing briefing papers from 84 students. Four assessors completed 622 comparisons in the Digital Platform for the Assessment of Competences (D-PAC) tool. The results indicate a reliability of 0.71 for the final rank order, obtained with a time investment of approximately 10.5 hours from the team of assessors. In addition, there was no evidence of bias towards the most important roles in the simulation game. The study also details how the resulting rank order was translated into grades, ranging from 11 to 17 out of 20. These elements showcase CJ’s advantage in reaching adequate reliability for briefing papers in an efficient manner.
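To illustrate the kind of analysis involved, the sketch below shows how pairwise judgements can be turned into a rank order and then into grades. It is a minimal illustration only, assuming a simple Bradley-Terry model fitted by gradient ascent and a linear rescaling to the reported grade band of 11 to 17 out of 20; the exact estimation and grading procedures used by D-PAC and in the chapter may differ, and all function and variable names are illustrative.

```python
# Minimal comparative-judgement sketch (assumption: Bradley-Terry model,
# not necessarily the model used by D-PAC or in the chapter).
import math
import random

def fit_bradley_terry(comparisons, n_items, n_iter=500, lr=0.1):
    """Estimate a latent quality score per briefing paper.

    comparisons: list of (winner_index, loser_index) pairs,
                 one per pairwise judgement by an assessor.
    Returns logit-scale scores (higher = judged better).
    """
    theta = [0.0] * n_items
    for _ in range(n_iter):
        grad = [0.0] * n_items
        for winner, loser in comparisons:
            # P(winner beats loser) under Bradley-Terry
            p = 1.0 / (1.0 + math.exp(theta[loser] - theta[winner]))
            grad[winner] += 1.0 - p
            grad[loser] -= 1.0 - p
        theta = [t + lr * g / len(comparisons) for t, g in zip(theta, grad)]
        # Centre the scale so scores sum to zero (identifiability constraint)
        mean = sum(theta) / n_items
        theta = [t - mean for t in theta]
    return theta

def scores_to_grades(scores, low=11, high=17):
    """Linearly rescale latent scores to a grade band (here 11-17 out of 20);
    the chapter's actual translation rule may be different."""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0
    return [low + (s - lo) / span * (high - low) for s in scores]

if __name__ == "__main__":
    random.seed(0)
    n_papers = 84
    true_quality = [random.gauss(0, 1) for _ in range(n_papers)]
    # Simulate 622 pairwise judgements, matching the study's numbers
    comparisons = []
    for _ in range(622):
        a, b = random.sample(range(n_papers), 2)
        p_a = 1.0 / (1.0 + math.exp(true_quality[b] - true_quality[a]))
        comparisons.append((a, b) if random.random() < p_a else (b, a))
    scores = fit_bradley_terry(comparisons, n_papers)
    grades = scores_to_grades(scores)
    print(f"Top paper grade: {max(grades):.1f}, bottom: {min(grades):.1f}")
```

The linear rescaling keeps the relative ordering of the papers intact while anchoring the extremes to the chosen grade band, which is one straightforward way to map a CJ rank order onto an existing grading scale.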
