Statistical Modeling of Student Performance to Improve Chinese Dictation Skills with an Intelligent Tutor

Over the past few years, the Pinyin Tutor has been used at more than thirty institutions around the world to teach students to transcribe spoken Chinese phrases into Pinyin. The program has collected large amounts of data on the types of errors students make on this task. We analyze these data to discover what makes the task difficult and use our findings to iteratively improve the tutor. For instance, does a particular set of consonants, vowels, or tones cause the most difficulty? Or do certain challenges arise from the context in which these sounds are spoken? Since each Pinyin phrase can be broken down into a set of features (for example, consonants, vowel sounds, and tones), we apply machine learning techniques to uncover the most confounding aspects of the task. We then use what we have learned to construct and maintain an accurate representation of what the student knows, so that instruction can be tailored to the individual. Our goal is to allow the learner to focus on the aspects of the task with which he or she is having the most difficulty, thereby accelerating his or her understanding of spoken Chinese beyond what would be possible without such focused "intelligent" instruction.
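
The feature decomposition mentioned above can be illustrated with a minimal Python sketch. It is not the Pinyin Tutor's actual code: the function names, the tone-number notation (for example "zhong1"), the use of 5 for the neutral tone, and the treatment of "y" and "w" as initials are all assumptions made for illustration.

```python
# Hypothetical sketch: split tone-numbered Pinyin syllables into features.
# Assumption: "y" and "w" are treated as initials for simplicity.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def syllable_features(syllable):
    """Return the initial, final, and tone of a syllable such as 'zhong1'."""
    syllable = syllable.lower()
    tone = 5  # assumption: 5 marks a neutral (unmarked) tone
    if syllable and syllable[-1].isdigit():
        tone = int(syllable[-1])
        syllable = syllable[:-1]
    # Two-letter initials come first in INITIALS, so "zh" wins over "z".
    initial = next((i for i in INITIALS if syllable.startswith(i)), "")
    final = syllable[len(initial):]
    return {"initial": initial, "final": final, "tone": tone}


def phrase_features(phrase):
    """Decompose a space-separated phrase into per-syllable features."""
    return [syllable_features(s) for s in phrase.split()]
```

Each syllable's features could then be encoded as indicator variables for a statistical model of student errors; the choice of model and encoding is left open here.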
