Using an ASR database to design a pronunciation evaluation system in Basque

This paper presents a method for building CAPT (computer-assisted pronunciation training) systems for under-resourced languages such as Basque, using a general-purpose ASR speech database. More precisely, the proposed method automatically determines the thresholds of the GOP (Goodness of Pronunciation) scores, which are used as phone-level pronunciation scores. For each phoneme, two score distributions are obtained, corresponding to its correct and incorrect pronunciations. The distribution of scores for erroneous pronunciations is calculated by inserting controlled errors into the dictionary: each altered phoneme is randomly replaced by a phoneme from the same group. These groups are obtained by phonetic clustering with regression trees. Once both distributions are available, the EER (Equal Error Rate) point of each distribution pair is calculated, and the score at that point is used as the decision threshold for the phoneme. The results show that this method is useful when no database specifically designed for CAPT systems is available, although it is not as accurate as using a database designed for this purpose.
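The two core steps described above, same-group phoneme substitution and EER-based thresholding, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `substitute_in_group` and `eer_threshold`, the toy phoneme groups, and the toy scores are all hypothetical, and the sketch assumes that a higher GOP score indicates a better pronunciation, so a phone is accepted when its score is at or above the threshold.

```python
import random


def substitute_in_group(phones, groups, rng):
    """Return a copy of `phones` with one phoneme swapped for another
    phoneme from the same group (hypothetical helper; the paper derives
    its groups from regression-tree clustering, here they are given)."""
    group_of = {p: g for g in groups for p in g}
    # Only phonemes whose group offers at least one alternative can be changed.
    eligible = [i for i, p in enumerate(phones) if len(group_of.get(p, ())) > 1]
    if not eligible:
        return list(phones)
    i = rng.choice(eligible)
    alternatives = sorted(q for q in group_of[phones[i]] if q != phones[i])
    corrupted = list(phones)
    corrupted[i] = rng.choice(alternatives)
    return corrupted


def eer_threshold(correct_scores, incorrect_scores):
    """Return (threshold, eer) for one phoneme.

    Assumption: higher score = better pronunciation, and a phone is
    accepted when score >= threshold. The threshold is the candidate
    score at which the false rejection rate (correct phones rejected)
    and the false acceptance rate (incorrect phones accepted) are closest.
    """
    candidates = sorted(set(correct_scores) | set(incorrect_scores))
    best = None
    for t in candidates:
        frr = sum(s < t for s in correct_scores) / len(correct_scores)
        far = sum(s >= t for s in incorrect_scores) / len(incorrect_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    _, threshold, eer = best
    return threshold, eer


if __name__ == "__main__":
    rng = random.Random(0)
    toy_groups = [{"a", "e"}, {"t", "d"}]  # toy groups, not the paper's clusters
    print(substitute_in_group(["e", "t", "a"], toy_groups, rng))

    # Toy GOP-like scores for a single phoneme (made up for illustration).
    correct = [-1.0, -0.5, -0.2, 0.0]
    incorrect = [-3.0, -2.5, -2.0, -0.8]
    print(eer_threshold(correct, incorrect))
```

In a real system the two score lists would come from aligning the same recordings against the original and the corrupted dictionary entries, one pair of lists per phoneme.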
