Paper Information - Language and Pronunciation Modeling in the CMU 1996 Hub 4 Evaluation

We describe several language and pronunciation modeling techniques that were applied to the 1996 Hub 4 Broadcast News transcription task. These include topic adaptation, the use of remote corpora, vocabulary size optimization, n-gram cutoff optimization, modeling of spontaneous speech, handling of unknown linguistic boundaries, higher order n-grams, weight optimization in rescoring, and lexical modeling of phrases and acronyms.

Stanley F. Chen | Maxine Eskenazi | Kristie Seymore
