Language technologies in speech-enabled second language learning games: from reading to dialogue

Second language learning has become an important societal need over the past decades. Because the number of language teachers falls far short of demand, computer-aided language learning software is becoming a promising supplement to traditional classroom learning and may also open new opportunities for self-learning. Speech technologies are especially attractive because they can offer students unlimited opportunities for speaking practice. To create helpful and intelligent speaking exercises, the computer must not only recognize the acoustics but also understand the meaning and respond appropriately. Nevertheless, most existing speech-enabled language learning software focuses only on speech recognition and pronunciation training. Very few systems exercise the student's composition and comprehension abilities or adopt language technologies to enable free-form conversation that emulates a real human tutor. This thesis investigates the critical functionalities of a computer-aided language learning system and presents a generic framework, together with various language- and domain-independent modules, for building complex speech-based language learning systems. Four games have been designed and implemented using the framework and the modules to demonstrate their usability and flexibility, with emphasis on dynamic content creation, automatic assessment, and automatic assistance. The four games (reading, translation, question-answering, and dialogue) offer different activities of gradually increasing difficulty and involve a wide range of language processing techniques, such as language understanding, language generation, question generation, context resolution, dialogue management, and user simulation. User studies with real subjects show that the systems were well received and judged to be helpful. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
