Assessing the Pedagogical Effectiveness and Conversational Appropriateness in Three Versions of AutoTutor

AutoTutor's effectiveness as a tutor and conversational partner was assessed during three development cycles of the system. In Cycle 1, AutoTutor interacted with virtual students, whereas in Cycles 2 and 3 it interacted with human students. The tutoring transcripts from the three cycles were analyzed by two sets of knowledgeable judges: one set rated the pedagogical quality of each AutoTutor dialog move, and the other rated the conversational appropriateness of each move. Data from all three evaluation cycles are presented in the paper.