Development of SRI’s 1997 Broadcast News Transcription System

This paper describes SRI’s 1997 broadcast news transcription system, used for the 1997 DARPA H4 evaluations. The system had several novel components: automatic segmentation of entire broadcast shows; word-internal and cross-word acoustic models robustly estimated with a new Gaussian Merging-Splitting (GMS) algorithm; the use of trigram language models (LMs) in lattices rather than for rescoring N-best lists; and an LM pruning algorithm that allows efficient representation of high-order (such as 4- or 5-gram) LMs. We briefly describe these features and give comparative experimental results. We achieved an 18.7% relative improvement in performance on our 1996 H4 partitioned evaluation (PE) development test set compared to our 1996 H4 PE evaluation system.
