Combination of finite state automata and neural network for spoken language understanding

Abstract

This paper proposes a novel approach to spoken language understanding based on a combination of weighted finite state automata and an artificial neural network. The automata act as a robust parser that extracts pieces of semantic information, called subframes, from an input sentence. The neural network then interprets the concept of the sentence by considering which subframes are present and the scores the automata assign to them. With the large number of concepts handled in our mixed-initiative dialogue system, the proposed system achieves considerable concept interpretation results on both a typed-in test set and a spoken test set. A high subframe recall rate further supports the applicability of the proposed system.

1. Introduction

A pioneering mixed-initiative spoken dialogue system with Thai language interaction has been constructed in the domain of hotel reservation [1]. The lack of resources for the new language, especially annotated corpora, made its development difficult. Consequently, except for the speech recognition engine, most parts of our first system were built on handcrafted rules. The preliminary evaluation showed that a considerable share of errors came from high out-of-vocabulary (OOV) and out-of-concept (OOC) rates in speech recognition and language understanding, respectively. The OOV rate can be reduced easily by adding entries to the recognizer lexicon. However, extending the set of concepts that the system can handle requires expanding the linguistic knowledge and a complicated annotated corpus.

Instead of improving the handcrafted rule-based understanding system implemented in the first version, we have recently tried to create a new system that can be trained automatically from a given corpus. Many research projects have split the spoken language understanding task into two consecutive subsystems, namely speech recognition and understanding. Since the speech recognizer is not the main focus of this article, its task is equivalent to finding the most likely meaning
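To make the two-stage design described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that subframe extraction can be approximated by spotting weighted token patterns (a stand-in for weighted finite state automata) and that concept interpretation can be approximated by a single linear layer with a softmax (a stand-in for a trained neural network). The subframe names, concept names, patterns, and weights are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical subframe patterns: each subframe name maps to weighted token
# sequences. A real system would compile such patterns into weighted automata.
SUBFRAME_PATTERNS = {
    "room_type": [(("single", "room"), 1.0), (("double", "room"), 1.0)],
    "num_nights": [(("two", "nights"), 0.9), (("one", "night"), 0.9)],
    "price_query": [(("how", "much"), 0.8)],
}

# Hypothetical concepts and hand-set weights (rows: concepts, columns:
# subframes in sorted name order). A trained network would learn these.
CONCEPTS = ["ask_price", "reserve_room"]
SUBFRAMES = sorted(SUBFRAME_PATTERNS)  # num_nights, price_query, room_type
WEIGHTS = [
    [0.0, 3.0, 0.5],   # ask_price
    [2.0, -1.0, 2.0],  # reserve_room
]


def extract_subframes(tokens):
    """Return {subframe: best score} for every pattern found in tokens.

    Words that match no pattern are simply skipped, which is what makes
    this stage robust to out-of-pattern input.
    """
    found = {}
    for name, patterns in SUBFRAME_PATTERNS.items():
        for seq, score in patterns:
            n = len(seq)
            windows = (tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
            if any(window == seq for window in windows):
                found[name] = max(found.get(name, 0.0), score)
    return found


def interpret_concept(subframes):
    """Pick a concept from subframe scores with a linear layer and softmax."""
    # Absent subframes contribute a score of 0.0 to the input vector.
    x = [subframes.get(name, 0.0) for name in SUBFRAMES]
    logits = [sum(w * v for w, v in zip(row, x)) for row in WEIGHTS]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(CONCEPTS)), key=probs.__getitem__)
    return CONCEPTS[best], probs
```

Under these assumed patterns and weights, calling `interpret_concept(extract_subframes("i want a double room for two nights".split()))` selects the hypothetical `reserve_room` concept, because both the `room_type` and `num_nights` subframes are found and score highly for it.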

[1] Richard M. Schwartz, et al. Hidden Understanding Models of Natural Language, 1994, ACL.

[2] Brendan J. Frey, et al. Combination of statistical and rule-based approaches for spoken language understanding, 2002, INTERSPEECH.

[3] Sadaoki Furui, et al. Pioneering a Thai Language Spoken Dialogue System, 2003.

[4] Giuseppe Riccardi, et al. How may I help you?, 1997, Speech Commun.

[5] Tatsuya Kawahara, et al. Domain-independent spoken dialogue platform using key-phrase spotting based on combined language model, 2001, INTERSPEECH.

[6] Wolfgang Minker, et al. A stochastic case frame approach for natural language understanding, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[7] H. Bonneau-Maynard, et al. Investigating stochastic speech understanding, 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.