Paper information - Turkish Large Vocabulary Continuous Speech Recognition by using limited audio corpus


In this paper, the recognition performance of several methodologies proposed for Turkish Large Vocabulary Continuous Speech Recognition is evaluated using a limited audio corpus. Word-based, stem-based, stem-ending-based, and morph-based language models are used with different n-gram orders. The word-based and stem-ending-based language models are extended using several approaches. A hybrid language model that combines the word-based and stem-ending-based language models is also proposed. The word-based language model is observed to outperform the sub-word language models when a limited audio corpus is used.
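As a rough illustration of how the choice of modeling unit changes what an n-gram model counts, the following minimal sketch maps the same sentence to word units and to stem-ending units and then collects bigrams. The segmentation table `TOY_SEGMENTS`, the `+` marker on endings, and the helper names `to_units` and `ngram_counts` are assumptions made for this example; the abstract does not describe the paper's actual segmentation or training tools.

```python
from collections import Counter

# Hypothetical hand-written segmentation table (stem, ending).
# A real system would use a morphological analyzer or stemmer instead.
TOY_SEGMENTS = {
    "evlerde": ("ev", "lerde"),
    "evler": ("ev", "ler"),
    "kitaplar": ("kitap", "lar"),
    "kitaplarda": ("kitap", "larda"),
}


def to_units(words, mode):
    """Map a word sequence to language-model units for the given mode."""
    units = []
    for word in words:
        # Words missing from the toy table are kept whole, with no ending.
        stem, ending = TOY_SEGMENTS.get(word, (word, ""))
        if mode == "word":
            units.append(word)
        elif mode == "stem":
            units.append(stem)
        elif mode == "stem-ending":
            units.append(stem)
            if ending:
                # Mark endings so that words can be rejoined after decoding.
                units.append("+" + ending)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return units


def ngram_counts(units, n):
    """Count all n-grams of order n in a unit sequence."""
    return Counter(tuple(units[i:i + n]) for i in range(len(units) - n + 1))


sentence = "evlerde kitaplar var".split()
word_units = to_units(sentence, "word")
stem_ending_units = to_units(sentence, "stem-ending")
word_bigrams = ngram_counts(word_units, 2)
stem_ending_bigrams = ngram_counts(stem_ending_units, 2)
```

In this toy setup the stem-ending sequence is longer than the word sequence, so an n-gram of a fixed order covers fewer whole words. This is one reason sub-word models are usually tried with different n-gram orders.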

Adnan Yazici | Selçuk Köprü | Derya Susman
