Assessing Pronunciation Improvement in Students of English Using a Controlled Computer-Assisted Pronunciation Tool

Over the last few years, interest in computer-assisted pronunciation training (CAPT) tools has grown, as has the commercial success of foreign language teaching applications that incorporate speech synthesis and automatic speech recognition technologies. However, empirical evidence supporting the pedagogical effectiveness of these systems remains scarce. In this article, a minimal-pair-based CAPT tool that implements exposure–perception–production cycles and provides automatic feedback to learners is tested for effectiveness in training adult native Spanish speakers (English level B1–B2) in the production of a set of difficult English sounds. Working under controlled conditions, a group of users took a pronunciation test before and after using the tool. Their results were compared with those of an in-classroom group who followed similar training within the traditional classroom setting. Results show a significant pronunciation improvement among the learners who used the CAPT tool, as well as a correlation between human raters' assessment of the posttests and the automatic CAPT assessment of users.
