Paper information - Speech recognition on Mandarin Call Home: a large-vocabulary, conversational, and telephone speech corpus

We describe IBM's most recent efforts in speech recognition on a conversational-speech database, the Mandarin Call Home corpus. Although it is similar to the well-known Switchboard corpus, the Call Home task addresses several major challenges in the domain of spoken language systems, including spontaneous dialogue with no pre-specified topics, limited-bandwidth telephone signals, and recognition of languages other than English. In particular, we describe the methodology used on the Mandarin Call Home corpus to address language-specific issues. We also compare our results with those obtained on the English Switchboard corpus. Preliminary experiments show that a character error rate of 58.7% can be achieved on the April 95 Mandarin Call Home data set. These results are comparable to those of the state-of-the-art IBM Switchboard system with a similar amount of training data.

Michael Picheny | Fu-Hua Liu | Michael D. Monkowski | Patibandla Srinivasa | C. Julian Chen

[1] Herbert Gish, et al. Reducing word error rate on conversational speech from the Switchboard corpus, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[2] Bruce M. Lairson, et al. Reducing Intergranular Magnetic Coupling by Incorporating Carbon into Co/Pd Multilayers, 1995.

[3] John J. Godfrey, et al. SWITCHBOARD: telephone speech corpus for research and development, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[4] Richard O. Duda, et al. Pattern classification and scene analysis, 1974, A Wiley-Interscience publication.

[5] Chiu-yu Tseng, et al. Golden Mandarin (I)-A real-time Mandarin speech dictation machine for Chinese language with very large vocabulary, 1993, IEEE Trans. Speech Audio Process.

[6] Michael Picheny, et al. Performance of the IBM large vocabulary continuous speech recognition system on the ARPA Wall Street Journal task, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[7] Michael Picheny, et al. Decision trees for phonological rules in continuous speech, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.