Out of vocabulary detection in Indonesian speech recognition using word and syllable level decoding

One problem in speech recognition is out-of-vocabulary (OOV) words, which can cause recognition errors. OOV words are words that the speech recognizer cannot recognize because they are not in its vocabulary. This work proposes a method that combines alignment, a language model, and part-of-speech (POS) tags to detect word errors caused by OOV words. The method takes the recognizer's word-level and syllable-level decoding outputs as input. The two outputs are aligned to find where they differ, and a language model and POS tags are then applied to decide whether the words are correct. Speech recognition accuracy is about 75% at an OOV rate of 15.5%. The OOV detection process reaches about 87% precision and 75% recall. Experiments also show that using OOV detection increases speech recognizer accuracy by 11%.
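The alignment step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical syllable lexicon (`LEXICON`), uses Python's standard `difflib.SequenceMatcher` as a stand-in for the alignment algorithm, and omits the language model and POS tag checks entirely. The function name `flag_mismatches` and the example sentences are illustrative only.

```python
from difflib import SequenceMatcher

# Hypothetical syllable lexicon for in-vocabulary words (illustrative only).
LEXICON = {
    "saya": ["sa", "ya"],
    "makan": ["ma", "kan"],
    "nasi": ["na", "si"],
}


def flag_mismatches(word_hyp, syllable_hyp, lexicon):
    """Return words whose syllables disagree with the syllable-level decode."""
    # Expand the word-level hypothesis into syllables and remember
    # which word each syllable came from.
    expanded, owner = [], []
    for idx, word in enumerate(word_hyp):
        for syllable in lexicon[word]:
            expanded.append(syllable)
            owner.append(idx)

    # Align the two syllable sequences (stand-in for the paper's alignment).
    matcher = SequenceMatcher(a=expanded, b=syllable_hyp, autojunk=False)

    suspect = set()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Substituted or deleted syllables mark their source words.
        for i in range(i1, i2):
            suspect.add(owner[i])
        # Inserted syllables are attributed to the nearest word.
        if tag == "insert" and owner:
            suspect.add(owner[min(i1, len(owner) - 1)])

    return [word_hyp[i] for i in sorted(suspect)]


# Word-level decode forced an in-vocabulary word; syllable-level decode differs.
words = ["saya", "makan", "nasi"]
syllables = ["sa", "ya", "mi", "num", "na", "si"]
candidates = flag_mismatches(words, syllables, LEXICON)
```

In this sketch, the words returned in `candidates` are only OOV candidates. In the method described above, a language model and POS tags would then be applied to decide whether each candidate is actually an error.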
