Multiple task-domain acoustic models

Many speech recognition applications require the recognizer to perform at peak recognition accuracy across many different domains, such as general English, digits, names, and the alphabet. Here we show a way to preserve the simplicity of a single acoustic model while providing domain-specific recognition speed and accuracy. This is achieved by employing an extended phoneme set that reserves a subset of phonemes for each particular domain, together with a context dependency specification that allows cross-word, cross-domain phonetic context dependencies. On a names recognition task, moving from a wrong-domain (general English) model to a multiple-domain model (general English, alphabet, names) reduces the error rate by more than 50%. A domain-specific model trained only on the names data further reduces the error rate by more than 50%.
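The idea of an extended, domain-tagged phoneme set with cross-domain context dependency can be sketched as follows. This is a minimal illustration, not the paper's implementation; the phone inventory, the `@`-tagged label format, and the `triphone` helper are all assumptions made for the example.

```python
# Minimal sketch (assumed naming, not the paper's code): each base phone
# is replicated once per domain, forming the extended phoneme set, and a
# triphone builder allows the left/right context phones to come from a
# different domain than the center phone (cross-word, cross-domain).

DOMAINS = ["general", "alphabet", "names"]
BASE_PHONES = ["ae", "b", "k", "sil"]

# Extended phoneme set: one copy of every base phone per domain.
EXTENDED_PHONES = {(ph, dom) for ph in BASE_PHONES for dom in DOMAINS}

def triphone(left, center, right):
    """Build a triphone label from (phone, domain) units.

    The context dependency deliberately permits the left/right context
    units to belong to a different domain than the center unit, which
    models phonetic context across domain boundaries at word junctions.
    """
    for unit in (left, center, right):
        assert unit in EXTENDED_PHONES, f"unknown unit {unit}"
    (l, ld), (c, cd), (r, rd) = left, center, right
    return f"{l}@{ld}-{c}@{cd}+{r}@{rd}"

# Cross-domain context at a word boundary: a names-domain phone
# preceded by a general-English phone.
label = triphone(("b", "general"), ("ae", "names"), ("k", "names"))
```

In a decoder built this way, a single acoustic model serves all domains, while the domain tags let the search apply domain-specific units and pronunciations where they help accuracy.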