Vietnamese sentence recognition algorithm in embedded device based on specialized transition network

This work presents a high-speed continuous speech recognition algorithm designed for embedded devices and evaluated on Vietnamese. The algorithm uses a transition network (TN) as its search space. The network integrates multiple language models and is condensed into a compact format defined by the algorithm, which reduces both processing time and memory usage during matching. Evaluated on 100 speech samples, the algorithm achieved an accuracy of 92.24% on a WandBoard Rev C1 embedded kit.
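
To make the idea of a condensed transition network used as a search space more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a word-level network built from a small list of allowed sentences, packed into flat arrays so that shared prefixes occupy a single path, and searched with a simple beam search. The names `TransitionNetwork` and `decode`, the per-step score dictionaries, and the example sentences are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple


class TransitionNetwork:
    """Word-level transition network stored in flat arrays (illustrative only)."""

    def __init__(self, sentences: List[List[str]]) -> None:
        # Step 1: build a prefix tree so sentences with a common prefix share states.
        children: List[Dict[str, int]] = [{}]
        finals: Set[int] = set()
        for sentence in sentences:
            state = 0
            for word in sentence:
                if word not in children[state]:
                    children.append({})
                    children[state][word] = len(children) - 1
                state = children[state][word]
            finals.add(state)

        # Step 2: pack the arcs into flat arrays; the arcs of state s live in
        # labels[offsets[s]:offsets[s + 1]] and targets[offsets[s]:offsets[s + 1]].
        self.offsets: List[int] = [0]
        self.labels: List[str] = []
        self.targets: List[int] = []
        for arcs in children:
            for word, target in sorted(arcs.items()):
                self.labels.append(word)
                self.targets.append(target)
            self.offsets.append(len(self.labels))
        self.finals = finals

    def arcs(self, state: int) -> Iterator[Tuple[str, int]]:
        """Yield (word, next_state) pairs leaving the given state."""
        lo, hi = self.offsets[state], self.offsets[state + 1]
        return zip(self.labels[lo:hi], self.targets[lo:hi])


def decode(
    tn: TransitionNetwork,
    step_scores: List[Dict[str, float]],
    beam: int = 3,
) -> Optional[List[str]]:
    """Beam search over the network.

    step_scores[t][word] is an assumed log-score for hearing `word` at step t;
    a real recognizer would obtain such scores from an acoustic model.
    """
    hyps: List[Tuple[float, int, List[str]]] = [(0.0, 0, [])]
    for scores in step_scores:
        expanded = []
        for score, state, words in hyps:
            for word, target in tn.arcs(state):
                if word in scores:
                    expanded.append((score + scores[word], target, words + [word]))
        # Keep only the best `beam` hypotheses to bound time and memory.
        expanded.sort(key=lambda h: h[0], reverse=True)
        hyps = expanded[:beam]
    complete = [h for h in hyps if h[1] in tn.finals]
    if not complete:
        return None
    return max(complete, key=lambda h: h[0])[2]
```

In this sketch only word sequences that follow a path in the network can be hypothesized, which is one way a restricted search space can lower matching cost. The beam width trades accuracy against speed and memory; the value 3 is arbitrary.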
