Paper information - Named entity tagged language models

Named entity tagged language models

We introduce named entity (NE) language modelling, a stochastic finite state machine approach to identifying both words and NE categories from a stream of spoken data. We provide an overview of our approach to NE tagged language model (LM) generation, together with results from applying such an LM to the task of out-of-vocabulary (OOV) word reduction in large vocabulary speech recognition. Using the Wall Street Journal and Broadcast News corpora, we show that the tagged LM reduced the overall word error rate by 14% and detected up to 70% of previously OOV words. We also describe an example of the direct tagging of spoken data with NE categories.

Steve Renals | Yoshihiko Gotoh | Gethin Williams
