Evidence-Centered Design of Epistemic Games: Measurement Principles for Complex Learning Environments

We are currently at an exciting juncture in developing effective means for assessing so-called 21st-century skills in an innovative yet reliable fashion. One promising avenue leads through the world of epistemic games (Shaffer, 2006a), which are games designed to give learners the rich experience of professional practica within a discipline. They serve to develop domain-specific expertise based on principles of collaborative learning, distributed expertise, and complex problem-solving. In this paper, we describe a comprehensive research program for investigating the methodological challenges that await rigorous inquiry within the context of epistemic games. We specifically demonstrate how the evidence-centered design framework (Mislevy, Almond, & Steinberg, 2003), together with current conceptualizations of reliability and validity theory, can be used to structure both the development of epistemic games and empirical research into their functioning. Using the epistemic game Urban Science (Bagley & Shaffer, 2009), we illustrate the numerous decisions that need to be made during game development and their implications for amassing qualitative and quantitative evidence about learners’ developing expertise within epistemic games.

[1] V. Shute. Focus on Formative Feedback, 2008.

[2] Steven M. Downing, et al. Handbook of Test Development, 2006.

[3] P. Boeck, et al. Explanatory Item Response Models: A Generalized Linear and Nonlinear Approach, 2004.

[4] Satoru Kawai, et al. An Algorithm for Drawing General Undirected Graphs. Inf. Process. Lett., 1989.

[5] Stanley Wasserman, et al. Social Network Analysis: Methods and Applications, 1994.

[6] Joint Committee on Testing Practices. Code of Fair Testing Practices in Education, 1990.

[7] De Ayala, et al. The Theory and Practice of Item Response Theory, 2008.

[8] Daniel J. Bauer. Observations on the Use of Growth Mixture Models in Psychological Research, 2007.

[9] David Gibson, et al. Games and Simulations in Online Learning: Research and Development Frameworks, 2006.

[10] Danielle S. McNamara, et al. Handbook of Latent Semantic Analysis, 2007.

[11] André A. Rupp, et al. Improving Testing: Applying Process Tools and Techniques to Assure Quality, edited by Wild, C. L., & Ramaswamy, R., 2009.

[12] J. Gee, et al. How Computer Games Help Children Learn, 2006.

[13] J. Templin, et al. Unique Characteristics of Diagnostic Classification Models: A Comprehensive Review of the Current State-of-the-Art, 2008.

[14] S. Embretson, et al. Item Response Theory for Psychologists, 2000.

[15] K. Tatsuoka. Rule Space: An Approach for Dealing with Misconceptions Based on Item Response Theory, 1983.

[16] Thomas Ehrlich, et al. Civic Responsibility and Higher Education, 2000.

[17] Jacqueline P. Leighton. Avoiding Misconception, Misuse, and Missed Opportunities: The Collection of Verbal Reports in Educational Achievement Testing, 2005.

[18] Eric T. Bradlow, et al. Testlet Response Theory and Its Applications, 2007.

[19] Robert J. Mislevy, et al. Intuitive Test Theory, 2005.

[20] Alija Kulenović, et al. Standards for Educational and Psychological Testing, 1999.

[21] Iasonas Lamprianou, et al. Code of Fair Testing Practices in Education, 2009.

[22] Jan L. Plass, et al. Design Factors for Educationally Effective Animations and Simulations. J. Comput. High. Educ., 2009.

[23] Robert J. Mislevy, et al. Automated Scoring of Complex Tasks in Computer-Based Testing, 2006.

[24] Elizabeth Bagley, et al. When People Get in the Way: Promoting Civic Thinking Through Epistemic Gameplay. Int. J. Gaming Comput. Mediat. Simulations, 2009.

[25] Denny Borsboom, et al. Test Validity in Cognitive Assessment, 2007.

[26] Winston Bennett, et al. Performance Measurement: Current Perspectives and Future Challenges, 2006.

[27] Robert L. Brennan. An Essay on the History and Future of Reliability from the Perspective of Replications, 2001.

[28] Michael J. Phelan. Estimating the Transition Probabilities from Censored Markov Renewal Processes, 1990.

[29] Pamela A. Moss. Can There Be Validity Without Reliability?, 1994.

[30] Robert J. Mislevy, et al. Can There Be Reliability without “Reliability?”, 2004.

[31] S. Messick. Validity of Psychological Assessment: Validation of Inferences from Persons’ Responses and Performances as Scientific Inquiry into Score Meaning. Research Report RR-94-45, 1994.

[32] M. J. Allen. Introduction to Measurement Theory, 1979.

[33] James Paul Gee. What Video Games Have to Teach Us About Learning and Literacy. CIE, 2007.

[34] Gautam Puhan, et al. Subscores Based on Classical Test Theory: To Report or Not to Report, 2007.

[35] David W. Shaffer. Epistemic Frames for Epistemic Games. Comput. Educ., 2006.

[36] Jonathan Templin, et al. Diagnostic Measurement: Theory, Methods, and Applications, 2010.

[37] David A. Frisbie. Reliability of Scores From Teacher-Made Tests, 1988.

[38] S. Haberman. When Can Subscores Have Value?, 2005.

[39] Elizabeth Bagley, et al. Epistemic Network Analysis: A Prototype for 21st Century Assessment of Learning, 2009.

[40] Mark J. Gierl, et al. Cognitive Diagnostic Assessment for Education: Verbal Reports as Data for Cognitive Diagnostic Assessment, 2007.

[41] Valerie J. Shute, et al. Monitoring and Fostering Learning Through Games and Embedded Assessments, 2008.

[42] D. Campbell, et al. Convergent and Discriminant Validation by the Multitrait-Multimethod Matrix. Psychological Bulletin, 1959.

[43] Russell G. Almond, et al. On the Structure of Educational Assessments. CSE Technical Report, 2003.

[44] Russell G. Almond, et al. Bayesian Network Models for Local Dependence Among Observable Outcome Variables, 2006.

[45] S. Messick. The Interplay of Evidence and Consequences in the Validation of Performance Assessments, 1994.

[46] D. Borsboom. Educational Measurement (4th ed.), 2009.