Automatic conversion between pronunciations of different English accents

We describe the application of decision trees to the automatic conversion of pronunciations between American, British and South African English accents. The resulting phoneme-to-phoneme (P2P) conversion technique derives the pronunciation of a word in a new target accent from its existing pronunciation in a different source accent. We find that deriving pronunciations in this way is substantially more accurate than deriving them directly from the orthography and the available target-accent pronunciations using more conventional grapheme-to-phoneme (G2P) conversion. Furthermore, by using both the graphemes and the phonemes of the source accent as input, grapheme-and-phoneme-to-phoneme (GP2P) conversion delivers additional increases in accuracy over P2P. These findings are particularly important for less-resourced varieties of English, for which extensive manually prepared pronunciation dictionaries are not available. With the P2P and GP2P approaches, the pronunciations of new words can be obtained with better accuracy than is possible using G2P methods.
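The difference between the three approaches lies in what the classifier sees as input when it predicts each target-accent phoneme. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that graphemes and source-accent phonemes have already been aligned one-to-one (with `-` as a hypothetical null phoneme), that a symmetric context window is used, and that the toy entry for "bath" and its phoneme symbols are purely illustrative. The names `window`, `features`, `PAD` and the `width` parameter are assumptions introduced here.

```python
from typing import List, Tuple

PAD = "#"  # hypothetical padding symbol for positions outside the word


def window(seq: List[str], i: int, width: int) -> Tuple[str, ...]:
    """Return the symbols at positions i-width .. i+width, padded at the edges."""
    return tuple(
        seq[j] if 0 <= j < len(seq) else PAD
        for j in range(i - width, i + width + 1)
    )


def features(
    graphemes: List[str],
    source_phonemes: List[str],
    i: int,
    mode: str,
    width: int = 1,
) -> Tuple[str, ...]:
    """Build the input features for predicting the target phoneme at position i.

    Assumes graphemes and source_phonemes are aligned and of equal length.
    """
    if mode == "G2P":    # orthography only
        return window(graphemes, i, width)
    if mode == "P2P":    # source-accent pronunciation only
        return window(source_phonemes, i, width)
    if mode == "GP2P":   # both information sources together
        return window(graphemes, i, width) + window(source_phonemes, i, width)
    raise ValueError(f"unknown mode: {mode}")


# Illustrative, hand-aligned toy entry (not taken from any real dictionary).
graphemes = ["b", "a", "t", "h"]
source_phonemes = ["b", "ae", "th", "-"]  # source-accent pronunciation

g2p_feats = features(graphemes, source_phonemes, 1, "G2P")
p2p_feats = features(graphemes, source_phonemes, 1, "P2P")
gp2p_feats = features(graphemes, source_phonemes, 1, "GP2P")
```

Each feature tuple, paired with the aligned target-accent phoneme at the same position, would form one training example for a decision tree learner. In the GP2P case the tree can condition on both the letter and the source phoneme, which is one plausible reading of why it can improve on P2P.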
