The use of syllable phonotactics for word hypothesization

A search technique that incorporates automatic modeling of lexical variability is introduced for medium- or large-vocabulary speaker-independent speech recognition. Current state-of-the-art systems depend on modeling the entire language from acoustic features together with syntactic constraints or inter-word probabilities. These methods often fail in the presence of multiple speakers, new vocabulary, noise, and spontaneous speech phenomena. A new approach to word hypothesization is proposed, based on an acoustic-phonetic unit called the pseudo-syllable segment. An algorithm is described for transforming a sequence of syllables into words. Techniques are suggested for controlling the accuracy of the syllabic hypothesis set and for learning the phonotactics of syllables automatically in a statistical framework.
