This article introduces a novel approach to modeling morphosyntax in morpheme-based speech recognizers. The proposed method is evaluated in our recent Hungarian large vocabulary continuous speech recognition (LVCSR) system, whose architecture is based on the weighted finite-state transducer (WFST) paradigm. The task domain is the recognition of fluently read sentences selected from a major daily newspaper. The vocabulary units of the system are morphemes, which provides sufficient coverage of the large number of word forms produced by affixation and compounding. In addition to the standard morpheme N-gram language model, we evaluate a novel stochastic morphosyntactic language model (SMLM) that describes the valid word forms (morpheme combinations) of the language. Thanks to the flexible transducer-based architecture of the system, the morphosyntactic component integrates seamlessly with the basic modules, and the decoder itself needs no modification. The proposed SMLM reduces the error rate by 17.9% relative to the baseline trigram system. The morpheme error rate of the best configuration is 14.75% in a 1350-morpheme Hungarian dictation task.
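The relationship between a relative error reduction and the underlying absolute error rates can be illustrated with a short calculation. The sketch below is only illustrative: the function names are hypothetical, and it assumes that the 14.75% best-configuration figure and the 17.9% relative reduction refer to the same metric and the same test set, which the text above does not state explicitly.

```python
def relative_reduction(baseline_error: float, improved_error: float) -> float:
    """Relative error reduction: the fraction of the baseline error removed."""
    return (baseline_error - improved_error) / baseline_error


def implied_baseline(improved_error: float, rel_reduction: float) -> float:
    """Baseline error rate implied by an improved rate and a relative reduction.

    Inverts relative_reduction(): improved = baseline * (1 - rel_reduction).
    """
    return improved_error / (1.0 - rel_reduction)


if __name__ == "__main__":
    # ASSUMPTION: both reported figures describe the same metric and test set.
    best_error = 14.75      # percent, best configuration as reported
    reduction = 0.179       # 17.9% relative reduction as reported
    baseline = implied_baseline(best_error, reduction)
    print(f"implied baseline error rate: {baseline:.2f}%")
    print(f"absolute reduction: {baseline - best_error:.2f} percentage points")
```

Under that assumption, the implied baseline is roughly 18%, so the relative gain of 17.9% corresponds to an absolute gain of a little over three percentage points.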