Psychometric analysis of the performance data of simulation-based assessment: A systematic review and a Bayesian network example

Multiple studies have shown that simulations and games can be effective and powerful tools for learning and instruction (cf. Mitchell & Savill-Smith, 2004; Kirriemuir & McFarlane, 2004). Most of these studies use a traditional pretest-posttest design in which students take a paper-based test (pretest), play the simulation or game, and then take a second paper-based test (posttest). Pretest-posttest designs treat the game as a black box in which something occurs that influences subsequent performance on the posttest (Buckley, Gobert, Horwitz, & O'Dwyer, 2010). Less research has used the product data or process data from game play itself as indicators of student proficiency in some area. Over the last decade, however, researchers have increasingly focused on what happens inside the black box, and the literature on the topic is growing. To our knowledge, no systematic reviews have been published that investigate the psychometric analysis of performance data of simulation-based assessment (SBA) and game-based assessment (GBA). Therefore, in Part I of this article, a systematic review on the psychometric analysis of the performance data of SBA is presented. The main question addressed in this review is: 'What psychometric strategies or models for treating and analyzing performance data from simulations and games are documented in scientific literature?' Then, in Part II of this article, the findings of our review are further illustrated with an empirical example of the Bayesian network, which according to our review is the most frequently applied psychometric model for the analysis of the performance data of SBA. The results from both Part I and Part II can inform future research into the use of simulations and games as assessment instruments.
We present a review on psychometric analysis of simulation-based assessment. We performed the review from the Evidence-Centered Design framework perspective. The Bayes Net is the most used psychometric model for simulation-based assessment. We present an example of a Bayes Net of a real simulation-based assessment. We make recommendations regarding the development of a simulation-based assessment.
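
To make the idea of a Bayes Net for performance data concrete, the following is a minimal sketch of how a belief about a student's proficiency might be updated from observed task outcomes. It is not the network presented in the article: the two-node structure (one proficiency variable, one observable), the state names, and all probabilities are hypothetical values chosen only for illustration.

```python
# Hypothetical two-node Bayes net: Proficiency -> Observable task outcome.
# Structure, state names, and all numbers are illustrative assumptions,
# not values taken from the article.

# Prior belief about the (latent) proficiency state.
PRIOR = {"low": 0.5, "high": 0.5}

# Conditional probability table: P(outcome | proficiency).
CPT = {
    "low": {"correct": 0.3, "incorrect": 0.7},
    "high": {"correct": 0.8, "incorrect": 0.2},
}


def update(belief, outcome):
    """Apply Bayes' rule once: posterior over proficiency given one outcome."""
    unnormalized = {state: belief[state] * CPT[state][outcome] for state in belief}
    total = sum(unnormalized.values())
    return {state: p / total for state, p in unnormalized.items()}


def score(outcomes, prior=PRIOR):
    """Update the belief sequentially over a list of observed outcomes.

    Assumes outcomes are conditionally independent given proficiency.
    """
    belief = dict(prior)
    for outcome in outcomes:
        belief = update(belief, outcome)
    return belief


# Example: two correct actions followed by one incorrect action.
posterior = score(["correct", "correct", "incorrect"])
print(posterior)
```

In a real simulation-based assessment the network would typically contain several proficiency variables and many observables derived from log data, and the conditional probabilities would be elicited from experts or estimated from data rather than set by hand.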

[1]  Robert J. Mislevy,et al.  Evidence-Centered Design of Epistemic Games: Measurement Principles for Complex Learning Environments. , 2010 .

[2]  Gregory K. W. K. Chung,et al.  Identifying Key Features of Student Performance in Educational Video Games and Simulations through Cluster Analysis , 2012, EDM 2012.

[3]  Valerie J. Shute,et al.  Modeling, Assessing, and Supporting Key Competencies Within Game Environments , 2010 .

[4]  Robert J. Mislevy,et al.  A Bayes net approach to modeling learning progressions and task performances , 2009 .

[5]  Gregory K. W. K. Chung,et al.  The Feasibility of Using Cluster Analysis to Examine Log Data from Educational Video Games. CRESST Report 790. , 2011 .

[6]  Gregory K. W. K. Chung,et al.  Examining Feedback in an Instructional Video Game Using Process Data and Error Analysis. CRESST Report 817. , 2012 .

[7]  Janice D. Gobert,et al.  Leveraging Educational Data Mining for Real-time Performance Assessment of Scientific Inquiry Skills within Microworlds , 2012, EDM 2012.

[8]  Richard Wainess,et al.  Automatic Assessment of Complex Task Performance in Games and Simulations. CRESST Report 775. , 2010 .

[9]  Robert J. Mislevy,et al.  Specifying and Refining a Measurement Model for a Computer-Based Interactive Assessment , 2004 .

[10]  Pavel Berkhin,et al.  A Survey of Clustering Data Mining Techniques , 2006, Grouping Multidimensional Data.

[11]  Richard Wainess,et al.  A Conceptual Framework for Assessing Performance in Games and Simulations. CRESST Report 771. , 2010 .

[12]  Troy D. Sadler,et al.  Cognitive diagnostic like approaches using neural-network analysis of serious educational videogames , 2014, Comput. Educ..

[13]  S. Klinkenberg,et al.  Computer adaptive practice of Maths ability using a new item response model for on the fly ability and difficulty estimation , 2011, Comput. Educ..

[14]  B. Yegnanarayana,et al.  Artificial Neural Networks , 2004 .

[15]  Kai Virtanen,et al.  Analyzing air combat simulation results with dynamic bayesian networks , 2007, 2007 Winter Simulation Conference.

[16]  R. Levy.  Psychometric and Evidentiary Advances, Opportunities, and Challenges for Simulation-Based Assessment , 2013 .

[17]  R. Almond,et al.  Making Sense of Data From Complex Assessments , 2002 .

[18]  John Kirriemuir,et al.  Literature Review in Games and Learning , 2004 .

[19]  P. Glasziou,et al.  Identifying studies for systematic reviews of diagnostic tests was difficult due to the poor sensitivity and precision of methodologic filters and the lack of information in the abstract. , 2005, Journal of clinical epidemiology.

[20]  V. Elizabeth Owen,et al.  Game-based assessment: an integrated model for capturing evidence of learning in play , 2014, Int. J. Learn. Technol..

[21]  Robert J. Mislevy,et al.  Putting ECD into Practice: The Interplay of Theory and Data in Evidence Models within a Digital Learning Environment , 2012, EDM 2012.

[22]  Robert J. Mislevy,et al.  Design and Discovery in Educational Assessment: Evidence-Centered Design, Psychometrics, and Educational Data Mining , 2012, EDM 2012.

[23]  Jodi L. Davenport,et al.  Next-Generation Environments for Assessing and Promoting Complex Science Learning. , 2013 .

[24]  V. Shute,et al.  Stealth Assessment: Measuring and Supporting Learning in Video Games , 2013 .

[25]  Robert J. Mislevy,et al.  Automated scoring of complex tasks in computer-based testing , 2006 .

[26]  Theodorus Johannes Hendrikus Maria Eggen,et al.  The effectiveness of methods for providing written feedback through a computer-based assessment for learning: a systematic review , 2011 .

[27]  Ron Stevens,et al.  Assessing Student Problem-Solving Skills With Complex Computer-Based Tasks , 2002 .

[28]  Michael J. Timms,et al.  Science Assessments for All: Integrating Science Simulations Into Balanced State Science Assessment Systems , 2012 .

[29]  Rafael Rumí,et al.  Bayesian networks in environmental modelling , 2011, Environ. Model. Softw..

[30]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[31]  Paul Horwitz,et al.  Looking inside the black box: assessing model-based learning and inquiry in BioLogica™ , 2010, Int. J. Learn. Technol..

[32]  Chris Dede,et al.  Assessment, Technology, and Change , 2010 .

[33]  Russell G. Almond,et al.  Bayes Nets in Educational Assessment: Where the Numbers Come From , 1999, UAI.

[34]  Russell G. Almond,et al.  Bayes Nets in Educational Assessment: Where Do the Numbers Come from? CSE Technical Report. , 2000 .

[35]  M. Csíkszentmihályi.  Flow: The Psychology of Optimal Experience. New York: HarperPerennial , 1990 .

[36]  Gregory K. W. K. Chung,et al.  Use of a Survival Analysis Technique in Understanding Game Performance in Instructional Games. CRESST Report 812. , 2012 .

[37]  M. Petticrew,et al.  Systematic Reviews in the Social Sciences: A Practical Guide , 2005 .

[38]  Richard E. Neapolitan,et al.  Learning Bayesian networks , 2007, KDD '07.

[39]  Robert J. Mislevy,et al.  A Bayesian Network Approach to Modeling Learning Progressions and Task Performance. CRESST Report 776. , 2010 .

[40]  Brian C. Nelson,et al.  Evidence-centered Design for Diagnostic Assessment within Digital Learning Environments: Integrating Modern Psychometrics and Educational Data Mining , 2012, EDM 2012.

[41]  Dirk Ifenthaler,et al.  Computer-Based Diagnostics and Systematic Analysis of Knowledge , 2010 .

[42]  Kevin B. Korb,et al.  Bayesian Artificial Intelligence, Second Edition , 2010 .

[43]  Kevin B. Korb,et al.  Bayesian Artificial Intelligence , 2004, Computer science and data analysis series.

[44]  Roy Levy.  Dynamic Bayesian Network Modeling of Game Based Diagnostic Assessments. CRESST Report 837. , 2014 .

[45]  Michael J. Timms,et al.  The promise of simulation-based science assessment: the Calipers project , 2010, Int. J. Learn. Technol..

[46]  V. Shute,et al.  Melding the Power of Serious Games and Embedded Assessment to Monitor and Foster Learning: Flow and Grow , 2009 .

[47]  Bernard P. Veldkamp,et al.  A blending of computer-based assessment and performance-based assessment: Multimedia-Based Performance Assessment (MBPA). The introduction of a new method of assessment in Dutch Vocational Education and Training (VET) , 2014 .