Paper information - Morpholexical and Discriminative Language Models for Turkish Automatic Speech Recognition

This paper introduces two complementary language modeling approaches for morphologically rich languages, aiming to alleviate the out-of-vocabulary (OOV) word problem and to exploit morphology as a knowledge source. The first model, the morpholexical language model, is a generative $n$-gram model whose modeling units are lexical-grammatical morphemes instead of the commonly used words or statistical sub-words. This paper also proposes a novel approach for integrating morphology as a knowledge source into an automatic speech recognition (ASR) system in the finite-state transducer framework. We accomplish this by building a morpholexical search network, obtained by composing the lexical transducer of a computational lexicon with a morpholexical language model. The second model is a linear reranking model trained discriminatively with a variant of the perceptron algorithm using morpholexical features. This variant, the WER-sensitive perceptron, is shown to perform better for reranking the $n$-best candidates obtained with the generative model. We apply the proposed models to a Turkish broadcast news transcription task and give experimental results. The morpholexical model leads to an elegant morphology-integrated search network with unlimited vocabulary. It is therefore highly effective in alleviating the OOV problem, and it improves the word error rate (WER) over word and statistical sub-word models by 1.8% and 0.4% absolute, respectively. The discriminatively trained morpholexical model further improves the WER of the system by 0.8% absolute.
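The reranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact update rule or feature set, so everything below is an assumption made for illustration: the perceptron update is scaled by the difference in word errors between the current top-scoring hypothesis and the lowest-error (oracle) hypothesis, and the feature map (`features`, with a hypothetical `__base__` feature for the generative model score plus morpheme unigram counts) is a stand-in for the paper's morpholexical features.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Token-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def features(hyp, base_score):
    """Hypothetical feature map: baseline score plus morpheme unigram counts."""
    feats = defaultdict(float)
    feats["__base__"] = base_score
    for morpheme in hyp:
        feats["uni:" + morpheme] += 1.0
    return feats


def score(weights, feats):
    """Linear model score: dot product of weights and features."""
    return sum(weights[name] * value for name, value in feats.items())


def train(nbest_lists, refs, epochs=5):
    """Error-scaled perceptron over n-best lists (assumed update rule).

    Each n-best list is a list of (token_list, base_score) pairs.
    """
    weights = defaultdict(float)
    weights["__base__"] = 1.0  # start from the generative model's ranking
    for _ in range(epochs):
        for nbest, ref in zip(nbest_lists, refs):
            errs = [edit_distance(ref, hyp) for hyp, _ in nbest]
            feats = [features(hyp, s) for hyp, s in nbest]
            oracle = min(range(len(nbest)), key=lambda i: errs[i])
            best = max(range(len(nbest)), key=lambda i: score(weights, feats[i]))
            delta = errs[best] - errs[oracle]  # extra errors of the current choice
            if delta > 0:
                # Move toward the oracle and away from the current best,
                # with a step proportional to the error difference.
                for name, value in feats[oracle].items():
                    weights[name] += delta * value
                for name, value in feats[best].items():
                    weights[name] -= delta * value
    return weights


def rerank(weights, nbest):
    """Return the token list of the highest-scoring hypothesis."""
    return max(nbest, key=lambda pair: score(weights, features(*pair)))[0]
```

In use, `train` would be given n-best lists of morpheme sequences with their generative model scores together with reference transcriptions, and `rerank` would then pick one hypothesis per list. Scaling the step by the error difference means that mistakes costing more word errors move the weights further, which is the intuition behind a loss-sensitive perceptron; a plain perceptron would use a fixed step instead.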

Murat Saraclar | Tunga Güngör | Hasim Sak
