Recent advances in the statistical modeling of the Slovak language

This paper describes recent advances in the statistical modeling of the Slovak language for the transcription of dictated, semi-spontaneous, and spontaneous conversational speech, such as judicial readings, broadcast news TV and radio shows, parliamentary proceedings, educational talks and lectures, and interactive conversations. In recent months, we have improved the efficiency and robustness of Slovak language models trained on electronic and web-based language resources. The improvements include better text processing and document classification, class-based and filled-pause modeling, n-gram augmentation, and fast language model adaptation. Experiments on judicial readings, broadcast news recordings, and parliamentary proceedings show a significant decrease in word error rate for multiple configurations of the acoustic and language models of the Slovak transcription system in the presented scenarios.
