Genre Categorization and Modeling for Broadcast Speech Transcription

Broadcast News (BN) speech transcription has attracted research interest since the mid-1990s because of the challenges the task poses. More recently, research has moved towards more spontaneous broadcast data, commonly called Broadcast Conversation (BC) speech. Given the large style difference between the BN and BC genres, modeling each genre specifically should intuitively improve system performance. In this paper, BN- and BC-style speech recognition is explored by designing genre-specific systems. To separate the training data by genre, an automatic genre categorization method using two novel features is proposed. Experiments showed that automatically assigned genre labels for the training data compared favorably with the manually specified genre labels provided with the corpora. When test data sets were classified as BN or BC and decoded by the corresponding genre-specific speech recognition system, modest but consistent error reductions were achieved relative to the baseline genre-independent systems.
