Scaling down: applying large vocabulary hybrid HMM-MLP methods to telephone recognition of digits and natural numbers

The hybrid hidden Markov model (HMM)/neural network (NN) speech recognition system at the International Computer Science Institute (ICSI) uses a multilayer perceptron (MLP) with a single hidden layer to compute the emission probabilities of the HMM states. This phoneme-based approach was developed for large-vocabulary continuous speech recognition. In this paper, the same recognition scheme is applied directly to corpora with much smaller vocabularies, namely the Spoken Language Understanding Numbers'93 database and the TI connected digits. The authors report on the development of small baseline systems intended to facilitate future research experiments, and on the use of these systems for experiments with context-dependent hybrid HMM-MLP systems.
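As an illustration of how an MLP can supply HMM emission scores, the following minimal sketch runs a single-hidden-layer network over a few feature frames and converts its class posteriors into scaled log-likelihoods by dividing by class priors, a conversion commonly used in hybrid systems. The abstract does not spell out these details, so the layer sizes, the sigmoid and softmax activations, the random weights, the uniform priors, and the function names `mlp_posteriors` and `scaled_log_likelihoods` are all assumptions made for illustration, not the ICSI implementation.

```python
import numpy as np


def mlp_posteriors(frames, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a single-hidden-layer MLP (assumed sigmoid + softmax)."""
    hidden = 1.0 / (1.0 + np.exp(-(frames @ w_hidden + b_hidden)))
    logits = hidden @ w_out + b_out
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def scaled_log_likelihoods(posteriors, priors, floor=1e-10):
    """log p(class | frame) - log p(class): an emission score up to a per-frame constant."""
    return np.log(np.maximum(posteriors, floor)) - np.log(np.maximum(priors, floor))


# Hypothetical dimensions and random, untrained weights, for illustration only.
rng = np.random.default_rng(0)
n_frames, n_features, n_hidden, n_phones = 5, 9, 16, 4

frames = rng.normal(size=(n_frames, n_features))
w_hidden = rng.normal(scale=0.1, size=(n_features, n_hidden))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(scale=0.1, size=(n_hidden, n_phones))
b_out = np.zeros(n_phones)

# Assumed uniform priors; in practice they would be estimated from training labels.
priors = np.full(n_phones, 1.0 / n_phones)

posteriors = mlp_posteriors(frames, w_hidden, b_hidden, w_out, b_out)
emissions = scaled_log_likelihoods(posteriors, priors)
```

Each row of `emissions` could then stand in for the per-state emission log-probabilities during Viterbi decoding, with one output class per phone-level HMM state.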