Speech Recognition by Goats, Wolves, Sheep and Non-Natives

Abstract: This paper gives an overview of the current understanding of the acoustic-phonetic issues that arise when recognizing speech from non-native speakers. Regional accents can be modeled by systematic shifts in pronunciation. These shifts are often better represented by multiple models than by pronunciation variants in the dictionary. The problem of non-native speech is much more difficult because it is influenced by both the native and the spoken language, making a multi-model approach inappropriate. It is also characterized by much higher speaker variability due to differing levels of proficiency. A few language-pair-specific rules describing the prototypical nativised pronunciation were found to be useful both in general speech recognition and in dedicated applications. However, due to the nature of the errors and the mappings, non-native speech recognition will remain inherently much harder. Moreover, the trend in speech recognition towards more detailed modeling is counterproductive for the recognition of non-natives.
