BUT Zero-Cost Speech Recognition 2016 System Description

This paper describes our work on developing speech recognizers for Vietnamese. It focuses on careful preparation of the provided data, in particular on analysis of the textual transcriptions. We propose and describe in detail methods for filtering out defective data to improve the performance of the final system. We also propose cleaning the other textual data used for language modeling. Several architectures are investigated to meet the goals of both sub-tasks, and the achieved results are discussed.
