Discriminative semi-supervised training for keyword search in low resource languages

In this paper, we investigate semi-supervised training for low-resource languages in which the initial systems may have a high error rate (≥ 70.0% word error rate). To handle the lack of data, we study semi-supervised techniques, including data selection, data weighting, discriminative training, and multilayer perceptron training, to improve system performance. The entire suite of semi-supervised methods presented in this paper was evaluated under the IARPA Babel program on keyword spotting tasks. Our semi-supervised system achieved the best performance in the OpenKWS13 surprise language evaluation for the limited condition. We describe our work on the Turkish and Vietnamese systems.

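The abstract names data selection and data weighting among the semi-supervised techniques studied. As an illustration only, and not the authors' implementation, the sketch below shows one common way such selection and weighting can be organized: keep automatically transcribed utterances whose recognizer confidence clears a threshold and carry the confidence forward as a weight on their training statistics. The utterance fields, the 0.5 threshold, and the use of the confidence itself as the weight are assumptions made for this example.

```python
# Minimal sketch of confidence-based data selection and weighting for
# semi-supervised training. Illustration only: the utterance fields, the
# 0.5 threshold, and the weighting scheme are assumptions, not the
# configuration used in the paper.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    utt_id: str
    hypothesis: str      # 1-best transcript from the seed recognizer
    confidence: float    # utterance-level confidence in [0, 1]


def select_and_weight(utts: List[Utterance],
                      threshold: float = 0.5) -> List[Tuple[Utterance, float]]:
    """Keep automatically transcribed utterances whose confidence exceeds
    the threshold, and attach a weight used to scale their contribution
    relative to the manually transcribed data."""
    selected = []
    for utt in utts:
        if utt.confidence >= threshold:
            # Use the confidence itself as the weight so lower-confidence
            # hypotheses contribute less to the accumulated statistics.
            selected.append((utt, utt.confidence))
    return selected


if __name__ == "__main__":
    pool = [
        Utterance("utt001", "merhaba nasilsin", 0.82),
        Utterance("utt002", "bilmiyorum", 0.31),
    ]
    for utt, weight in select_and_weight(pool):
        print(f"{utt.utt_id}\tweight={weight:.2f}\t{utt.hypothesis}")
```

In practice the retained weights would scale the statistics accumulated for acoustic model updates (e.g., in discriminative training), so that uncertain automatic transcripts do not dominate the limited amount of supervised data.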