A French Non-Native Corpus for Automatic Speech Recognition

Automatic speech recognition (ASR) technology has matured to the point where it is practical for novice users. However, most non-native speakers are still not comfortable with services that rely on ASR, because recognition accuracy is lower for non-native speech. This paper describes our approach to constructing a non-native speech corpus in French, intended for testing ASR systems on non-native speakers and adapting them to such speakers. We also propose a method for detecting pronunciation variants and possible pronunciation mistakes made by non-native speakers.
