A turbo-style algorithm for lexical baseforms estimation

This paper presents and implements an iterative, unsupervised Turbo-style algorithm for automatic lexical acquisition. The algorithm uses spoken examples of both spellings and words, and fuses information from letter and subword recognizers to improve overall lexical learning performance. It is tested on a challenging lexicon of restaurant and street names and evaluated in terms of spelling accuracy and letter error rate. After only two iterations of the algorithm, spelling accuracy improves by 7.2% absolute and the letter error rate by 3% absolute (15.5% relative).
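The two evaluation measures named above can be illustrated with a minimal sketch. It assumes the common definitions: letter error rate as the total Levenshtein edit distance between hypothesized and reference spellings divided by the total number of reference letters, and spelling accuracy as the fraction of words whose hypothesized spelling matches the reference exactly. The abstract does not give the exact scoring procedure, so these definitions, the function names, and the sample words are illustrative assumptions only.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum letter substitutions, insertions, deletions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference letters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def letter_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Assumed definition: total edit distance over total reference letters."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_letters = sum(len(r) for r in refs)
    return errors / total_letters


def spelling_accuracy(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Assumed definition: fraction of words spelled exactly right."""
    correct = sum(r == h for r, h in zip(refs, hyps))
    return correct / len(refs)


# Hypothetical sample data, not taken from the paper's lexicon.
refs = ["boylston", "newbury"]
hyps = ["boylston", "newberry"]
print(letter_error_rate(refs, hyps))
print(spelling_accuracy(refs, hyps))
```

Under these definitions a single misspelled word lowers spelling accuracy by a full word, while the letter error rate only counts the letters that differ, which is why the two measures can move by different amounts.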
