In this paper, we describe the BBN BYBLOS system used for the 1999 Hub-4E 10xRT evaluation benchmark and discuss the improvements made to the system in 1999. We focus on the techniques that were new in this year’s system and that were aimed at the best tradeoff between accuracy and speed for the evaluation benchmark test. Overall, we reduced the word error rate on the 1998 Hub-4E evaluation test by 14% relative to our 1998 10xRT system (from 17.1% to 14.7%); equivalently, we sped up the 1998 Primary system 24 times (from 240xRT to 10xRT) while maintaining the same word error rate (14.7%). We attribute this progress to improvements in fast segmentation using dual-band and dual-gender phone-class models based on RASTA-normalized features, supervised MLLR adaptation of band-limited models to real telephone training data, adaptation between decoding passes, and various adaptation speedups.
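The two headline figures above follow from simple arithmetic on the reported numbers. The sketch below only reproduces that arithmetic; the function names `relative_reduction` and `speedup` are illustrative and do not come from the BYBLOS system.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


def speedup(baseline_xrt: float, improved_xrt: float) -> float:
    """Ratio of two real-time factors (higher xRT means slower)."""
    return baseline_xrt / improved_xrt


# Word error rates (%) reported for the 1998 and 1999 10xRT systems.
wer_gain = relative_reduction(17.1, 14.7)

# Real-time factors reported for the 1998 Primary system and the 1999 system.
rt_gain = speedup(240.0, 10.0)

print(f"relative WER reduction: {wer_gain:.1%}")
print(f"speedup: {rt_gain:.0f}x")
```

The absolute reduction is 2.4 points of word error rate; dividing by the 17.1% baseline gives the relative figure quoted in the text.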