Assessing Student Self-Explanations in an Intelligent Tutoring System

Vasile Rus (vrus@memphis.edu)
Department of Computer Science, Institute for Intelligent Systems
The University of Memphis, Memphis, TN 38152

Arthur C. Graesser (a-graesser@memphis.edu)
Department of Psychology, Institute for Intelligent Systems
The University of Memphis, Memphis, TN 38152

Philip M. McCarthy (pmmccrth@memphis.edu)
Department of Psychology, Institute for Intelligent Systems
The University of Memphis, Memphis, TN 38152

Danielle S. McNamara (d.mcnamara@mail.psyc.memphis.edu)
Department of Psychology, Institute for Intelligent Systems
The University of Memphis, Memphis, TN 38152

Mihai C. Lintean (M.Lintean@memphis.edu)
Department of Computer Science, Institute for Intelligent Systems
The University of Memphis, Memphis, TN 38152

Abstract

Research indicates that guided feedback facilitates learning, whether in the classroom or with Intelligent Tutoring Systems (ITSs). Improving the accuracy of the evaluation of user input is therefore necessary for providing optimal feedback. This study investigated an automated assessment of students' input that involved a lexico-syntactic (entailment) approach to textual analysis along with a variety of other textual assessment measures. The corpus consisted of 357 student responses taken from a recent experiment with iSTART, an ITS that provides students with self-explanation and reading strategy training. The results of our study indicated that the entailment approach provided the highest single measure of accuracy for assessing input when compared to the other measures in the study. A set of indices working in conjunction with the entailment approach provided the best overall assessments.

Keywords: entailment; intelligent tutoring systems; iSTART; paraphrase; latent semantic analysis.

Introduction

A major challenge for Intelligent Tutoring Systems (ITSs) that incorporate natural language interaction is to accurately evaluate users' contributions and to produce accurate feedback. Available research in the learning sciences indicates that guided feedback and explanation is more effective than simply indicating whether student input is right or wrong (Aleven & Koedinger, 2002; Anderson et al., 1989; Kluger & DeNisi, 1996; McKendree, 1990; Sims-Knight & Upchurch, 2001). The benefits of feedback specifically in ITSs are equally evident (Azevedo & Bernard, 1995).

This study addresses the challenge of evaluating users' textual input in ITS environments. More specifically, we assess entailment evaluations generated by a lexico-syntactic computational tool called the Entailer (Rus et al., 2005; Rus, McCarthy, & Graesser, 2006). Our corpus of natural language input generated from an ITS consists of student contributions from users of iSTART (Interactive Strategy Training for Active Reading and Thinking; McNamara, Levinstein, & Boonthum, 2004), a web-based tutoring system that provides students with self-explanation and reading strategy training. The iSTART student statements were sampled from the final phase of iSTART training. During this stage, a pedagogical agent reads sentences from a textbook aloud and asks the student to type a self-explanation of each sentence. The focus of this study is to distinguish two very similar student self-explanation categories: topic identification sentences and paraphrases. This distinction is challenging because the lexicon used for both topic identification and paraphrase tends to largely overlap with the iSTART target sentences. Thus, for the iSTART agent to provide the most appropriate feedback to the student, accurate algorithms are required to successfully interpret the student's input and make this distinction.

This study tests various measures for evaluating student input and formulates an algorithm from a combination of successful indices. The algorithm accurately assesses the student input, distinguishing topic-sentence-type self-explanations from paraphrase-type self-explanations. Thus, once implemented, iSTART agents will be able to provide more informative feedback to students.

Interactive Strategy Training for Active Reading and Thinking (iSTART)

iSTART provides young adolescent to college-aged students with tutored self-explanation and reading strategy training via pedagogical agents (McNamara et al., 2004). iSTART is designed to improve students' ability to self-explain by teaching them to use reading strategies such as
References

[1] Danielle S. McNamara et al. Analyzing Writing Styles with Coh-Metrix. FLAIRS, 2006.
[2] Roger Azevedo et al. A Meta-Analysis of the Effects of Feedback in Computer-Based Instruction. 1995.
[3] George A. Miller et al. WordNet: A Lexical Database for English. HLT, 1995.
[4] Arthur C. Graesser et al. AutoTutor: An Intelligent Tutoring System with Mixed-Initiative Dialogue. IEEE Transactions on Education, 2005.
[5] Vincent Aleven et al. An Effective Metacognitive Strategy: Learning by Doing and Explaining with a Computer-Based Cognitive Tutor. Cognitive Science, 2002.
[6] Arthur C. Graesser et al. Analysis of a Textual Entailer. CICLing, 2006.
[7] David M. Magerman. Natural Language Parsing as Statistical Pattern Recognition. ArXiv, 1994.
[8] A. Kluger et al. The Effects of Feedback Interventions on Performance: A Historical Review, a Meta-Analysis, and a Preliminary Feedback Intervention Theory. 1996.
[9] R. L. Upchurch et al. What's Wrong with Giving Students Feedback? 2001.
[10] Danielle S. McNamara et al. Self-Explaining Science Texts: Strategies, Knowledge, and Reading Skill. 2004.
[11] John R. Anderson et al. Cognitive Tutors: Lessons Learned. 1995.
[12] Jean McKendree et al. Effective Feedback Content for Tutoring Complex Skills. Human-Computer Interaction, 1990.
[13] Danielle S. McNamara et al. iSTART: Interactive Strategy Training for Active Reading and Thinking. Behavior Research Methods, Instruments, & Computers, 2004.
[14] Eugene Charniak et al. A Maximum-Entropy-Inspired Parser. ANLP, 2000.
[15] Ido Dagan et al. The Third PASCAL Recognizing Textual Entailment Challenge. ACL-PASCAL Workshop, 2007.
[16] Arthur C. Graesser et al. Assessing Entailer with a Corpus of Natural Language from an Intelligent Tutoring System. FLAIRS, 2007.