1. Introduction

Advances in both technology and cognitively based assessment design are driving a radically new vision of assessment, one that holds the promise of increasing the validity, reliability, and generalizability of test scores (e.g., Zenisky & Sireci, 2002). For example, the National Assessment of Educational Progress (NAEP) has embarked on including Scenario-Based Tasks (SBTs) in its Technology and Engineering Literacy (TEL) assessment. SBTs are interactive tasks in which students solve problems within realistic scenarios.

Although advances in technology open new opportunities for learning and measurement, and influence task design, delivery, and data collection, these technology-enhanced SBTs bring with them new challenges for analyzing and modeling the data. These challenges stem from the myriad of possible responses and from the ill-defined unit of measurement in this complex solution space (Levy, 2012).

Unlike multiple-choice items, SBTs provide students with a relatively open workspace for solving a problem; that is, students have greater freedom in how they approach the problems posed by the tasks. As a result, different students may use different processes to resolve the problems. The term process data refers to all of the tracked steps that a student takes to solve a problem in an SBT. Task analysis and scoring, which normally focus exclusively on the outcomes of problem-solving activity, cannot address whether meaningful differences exist among students' approaches to solving a problem. For instance, what features in the tracked steps are characteristic of successful approaches to a problem? How can unsuccessful strategies be described and distinguished from one another? Progress on broad questions like these depends on reliable and valid quantitative approaches for identifying and describing students' response processes on new types of items. In this paper, we address two interwoven research questions: (1) how to characterize process data so that the key features of students' processes are captured and the differences among processes can be distinguished, and (2) how to use the identified features of students' response processes to make inferences about target constructs.

In the educational testing field, there is strong interest in inferring individual students' abilities from their response processes. Recent work has focused on scoring and characterizing process data; see, for example, a set of papers analyzing NAEP TEL process data: Hao et al. (2015), where a measure borrowed from text analysis, the edit distance, was introduced to describe and score students' processes; Bergner et al. (2014), where cluster analysis was proposed for characterizing process data; and Zhu et al. (2016), where social network analysis was applied to the steps and sequences of students' processes.
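To make the edit-distance idea concrete, the sketch below computes the classic Levenshtein distance between two action sequences, counting the insertions, deletions, and substitutions needed to turn one process into another. This is our own minimal illustration of the general technique, not code from Hao et al. (2015), and the action labels are hypothetical rather than drawn from the Wells task.

```python
def edit_distance(seq_a, seq_b):
    """Dynamic-programming Levenshtein distance over two action lists."""
    m, n = len(seq_a), len(seq_b)
    # dp[i][j] = distance between the first i actions of seq_a
    # and the first j actions of seq_b
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if seq_a[i - 1] == seq_b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

# Hypothetical response processes: a student's tracked actions compared
# against an "ideal" solution path, so the distance scores how far the
# student's process deviates from it.
student = ["open_sim", "set_depth", "run_trial", "run_trial", "submit"]
ideal = ["open_sim", "set_depth", "run_trial", "submit"]
print(edit_distance(student, ideal))  # -> 1 (one extra run_trial)
```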
In this paper, we propose an approach inspired by classic Markov models and Item Response Theory (IRT) models to model the process of solving problems in SBTs. We begin with a more general theoretical description to introduce the method, and then narrow it down for the analysis of the data example. As in classic Markov models, we first assume that a student's response process has the Markov property; that is, the next state of the stochastic process depends only on the present state. As in IRT models, the proposed approach uses individual-level latent variables to characterize the features of each individual student's response process. Hereafter, we refer to this proposed approach as the Markov-IRT model.

In the rest of the paper, we first present a task, called the Wells task, from the NAEP TEL assessment, and then we present and discuss the proposed Markov-IRT model using that task. …
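To make the Markov-property assumption concrete, the following sketch estimates a first-order transition-probability matrix from a single student's action sequence; summaries of such a per-student matrix are one way individual-level features of a response process can arise. The function and the state labels are our own illustrative assumptions, not the paper's actual estimation procedure, which is more involved.

```python
from collections import defaultdict

def transition_matrix(sequence, states):
    """Maximum-likelihood estimate of first-order transition probabilities
    P(next state | current state) from one observed action sequence."""
    counts = {s: defaultdict(int) for s in states}
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    matrix = {}
    for s in states:
        total = sum(counts[s].values())
        # Rows with no observed outgoing transitions (e.g., a final
        # "submit" state) are left at zero for simplicity.
        matrix[s] = {t: (counts[s][t] / total if total else 0.0)
                     for t in states}
    return matrix

# Hypothetical action states for one student's process in an SBT.
states = ["explore", "adjust", "run", "submit"]
process = ["explore", "adjust", "run", "adjust", "run", "submit"]
P = transition_matrix(process, states)
print(P["adjust"]["run"])  # -> 1.0: this student always runs after adjusting
```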
References

[1] Bergner, Y., et al. (2014). Visualization and Confirmatory Clustering of Sequence Data from a Simulation-Based Assessment Task. EDM.
[2] Sireci, S. G., et al. (2002). Technological Innovations in Large-Scale Assessment.
[3] Baum, L., et al. (1966). Statistical Inference for Probabilistic Functions of Finite State Markov Chains.
[4] Hao, J., et al. (2015). Analyzing Process Data from Game/Scenario-Based Tasks: An Edit Distance Approach. EDM 2015.
[5] van de Pol, F., et al. (1990). Mixed Markov Latent Class Models.
[6] Zhu, M., et al. (2016). Using Networks to Visualize and Analyze Process Data for Educational Assessment.
[7] Seneta, E. (1996). Markov and the Birth of Chain Dependence Theory.
[8] Bush, R. R., et al. (1951). A Mathematical Model for Simple Learning.
[9] Bellman, R. (1957). A Markovian Decision Process.
[10] Haberman, S. J., et al. (2013). A General Program for Item-Response Analysis That Employs the Stabilized Newton–Raphson Algorithm.
[11] Levy, R., et al. (2012). Psychometric Advances, Opportunities, and Challenges for Simulation-Based Assessment.
[12] Haberman, S. J. (2006). An Elementary Test of the Normal 2PL Model Against the Normal 3PL Alternative.
[13] Juang, B.-H., et al. (1989). HMM Clustering for Connected Word Recognition. International Conference on Acoustics, Speech, and Signal Processing.
[14] Rabiner, L. R., et al. (1989). A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE.
[15] Almond, R. G., et al. (2001). A Sample Assessment Using the Four Process Framework. CSE Technical Report 543.
[16] Estes, W. (1994). Toward a Statistical Theory of Learning.
[17] Muraki, E. (1992). A Generalized Partial Credit Model: Application of an EM Algorithm.
[18] Haberman, S. J., et al. (2009). Use of Generalized Residuals to Examine Goodness of Fit of Item Response Models.
[19] Haberman, S., et al. (2001). Analysis of Categorical Response Profiles by Informative Summaries.
[20] Hambleton, R., et al. (1984). Item Response Theory: Principles and Applications.
[21] Tatsuoka, K. (1983). Rule Space: An Approach for Dealing with Misconceptions Based on Item Response Theory.
[22] Savage, L. J. (1971). Elicitation of Personal Probabilities and Expectations.