Paper information - Compression of exception lexicons for small footprint grapheme-to-phoneme conversion

We present a method to reduce the memory footprint of a grapheme-to-phoneme (G2P) conversion module without sacrificing accuracy. Since the G2P module is typically not 100% correct, it is common to augment the system with an exception lexicon: a list of words that the G2P module does not handle correctly (and for which we require correct pronunciations), along with their corrected pronunciations. Since the size of the exception lexicon is one of the major limiting factors in reducing the overall size of the G2P module, we compress the exception lexicon. We suggest a novel compression method that is closely tied to the G2P conversion method. The idea behind this compression is that, even for words that are not transduced correctly, the decision trees used by the G2P module generate a phonetic transcription that is close to the correct one. Therefore, it is sufficient to store only the correction in the exception lexicon. The correction information is represented in terms of corrections to the transduction process; it can thus exploit the knowledge gained from the training data about the probabilities of different corrections, which yields more efficient compression. In an experiment, this method represented an exception pronunciation with fewer than 4 bits on average (a compression factor of 7 compared to the baseline representation).
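The idea of storing only corrections can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `CANDIDATES` table is a made-up stand-in for the decision trees, it assumes one phoneme (possibly empty) per letter with no context, and the probabilities are invented for illustration. Each exception is stored as the rank of the correct phoneme among the G2P candidates for each letter, and `ideal_bits` estimates the cost an entropy coder could approach when it uses the candidate probabilities.

```python
import math

# Hypothetical stand-in for the G2P decision trees: for each letter, the
# candidate phonemes with invented probabilities, most likely first.
# A real G2P module would condition these on the surrounding letters.
CANDIDATES = {
    "c": [("k", 0.875), ("s", 0.125)],
    "i": [("ih", 0.75), ("ay", 0.25)],
    "t": [("t", 1.0)],
    "e": [("", 0.5), ("eh", 0.25), ("iy", 0.25)],
}


def encode(word, pronunciation):
    """Return, per letter, the rank of the correct phoneme (0 = G2P top choice)."""
    ranks = []
    for letter, phoneme in zip(word, pronunciation):
        phonemes = [p for p, _ in CANDIDATES[letter]]
        ranks.append(phonemes.index(phoneme))
    return ranks


def decode(word, ranks):
    """Rebuild the pronunciation by applying the stored ranks to the candidates."""
    return [CANDIDATES[letter][rank][0] for letter, rank in zip(word, ranks)]


def ideal_bits(word, ranks):
    """Sum of -log2(p) over the chosen candidates: an ideal code length in bits."""
    return sum(-math.log2(CANDIDATES[letter][rank][1])
               for letter, rank in zip(word, ranks))


# The uncorrected G2P output is the all-zero rank sequence.
ranks = encode("cite", ["s", "ay", "t", ""])
print(ranks, decode("cite", ranks), ideal_bits("cite", ranks))
```

In this toy setting, positions where the G2P output is already likely to be right cost few bits or none, so an entry is cheap when the predicted transcription is close to the correct one.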

Joram Meron | Peter Veprek
