An overview: Context-dependent acoustic modeling for LVCSR

Automatic speech recognition (ASR) converts a speech signal into text accurately and efficiently. Typically, the speech signal is processed at the front end to extract features, which are then scored at the back end against a Gaussian mixture model (GMM); selecting an appropriate number of GMM mixture components for the size of the dataset is important. For small vocabularies, triphone-based acoustic modeling (one phone of context on each side) gives good results, but for large vocabularies, quinphone-based acoustic modeling, which extends the context to two phones on each side, is expected to perform better. This paper presents an overview of quinphone-based acoustic modeling aimed at reducing the error rate.
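The difference between triphone and quinphone units can be sketched as follows. This is an illustrative example, not code from the paper: the `L-C+R` label notation and the `sil` padding symbol are common conventions, but the exact formatting is an assumption.

```python
def context_units(phones, width):
    """Build context-dependent unit labels from a phone sequence.

    width=1 yields triphones (one phone of context on each side);
    width=2 yields quinphones (two phones of context on each side).
    Sequence boundaries are padded with a silence symbol "sil"
    (an illustrative convention, not prescribed by the paper).
    """
    padded = ["sil"] * width + list(phones) + ["sil"] * width
    units = []
    for i in range(width, len(padded) - width):
        left = "_".join(padded[i - width:i])
        right = "_".join(padded[i + 1:i + 1 + width])
        units.append(f"{left}-{padded[i]}+{right}")
    return units

# Example: the word "cat" as /k ae t/
print(context_units(["k", "ae", "t"], 1))
# → ['sil-k+ae', 'k-ae+t', 'ae-t+sil']
print(context_units(["k", "ae", "t"], 2))
# → ['sil_sil-k+ae_t', 'sil_k-ae+t_sil', 'k_ae-t+sil_sil']
```

The widened context is why quinphones suit large vocabularies: with N base phones, the number of possible quinphone units grows as N^5 rather than the N^3 of triphones, so they capture more coarticulation but require large training sets (and parameter-sharing techniques such as state tying) to estimate reliably.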