Investigation of morph-based speech recognition improvements across speech genres

The improvement achieved by changing the basis of speech recognition from words to morphs (various sub-word units) varies greatly across tasks and languages. We explore the source of this variability by investigating three large-vocabulary continuous speech recognition (LVCSR) tasks, each corresponding to a different speech genre of a highly agglutinative language. Transcription results for novels, press conferences and broadcast news are presented and compared with spontaneous speech recognition results in several experimental setups. We observe a noticeable correlation between an easily computable characteristic of speech recognition tasks in various languages and the relative improvements due to (statistical) morph-based approaches.
