Streamlined Decoder for Chinese Spoken Language Understanding

As a critical component of a Spoken Dialog System (SDS), spoken language understanding (SLU) has attracted considerable attention, especially methods based on unaligned data. A recently proposed approach exploits the hierarchical relationship within act-slot-value triples. However, it does not pass internal state between levels, even though that state may record intermediate information from the upper level and help predict the lower level. We therefore propose a novel streamlined decoding structure with an attention mechanism, which uses three successively connected RNNs to decode the act, slot, and value, respectively. In the first Chinese Audio-Textual Spoken Language Understanding Challenge (CATSLU), our model outperforms the state-of-the-art model on the unaligned, multi-turn, task-oriented Chinese spoken dialogue dataset provided by the contest.
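To make the idea of successively connected decoders concrete, the following is a minimal sketch, not the model evaluated in this work. It assumes a plain tanh recurrent cell, dot-product attention, one decoding step per level, randomly initialized weights, and encoder states whose dimension equals the decoder hidden size; the names `DecoderStage`, `attend`, and `streamlined_decode` are hypothetical. The only point it illustrates is that each level starts from the hidden state left by the level above it.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    """Random weights stand in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def attend(query, enc_states):
    """Dot-product attention: softmax-weighted sum of encoder states."""
    scores = [sum(q * e for q, e in zip(query, s)) for s in enc_states]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(enc_states[0])
    context = [sum(w * s[i] for w, s in zip(weights, enc_states)) for i in range(dim)]
    return context, weights


class DecoderStage:
    """One level of the chain (act, slot, or value); hypothetical simplification."""

    def __init__(self, hidden, n_labels, rng):
        self.W_h = rand_matrix(hidden, hidden, rng)    # recurrent weights
        self.W_c = rand_matrix(hidden, hidden, rng)    # attention-context weights
        self.W_o = rand_matrix(n_labels, hidden, rng)  # output projection

    def step(self, h_prev, enc_states):
        context, _ = attend(h_prev, enc_states)
        pre = [a + b for a, b in zip(matvec(self.W_h, h_prev),
                                     matvec(self.W_c, context))]
        h = [math.tanh(p) for p in pre]
        logits = matvec(self.W_o, h)
        label = max(range(len(logits)), key=logits.__getitem__)
        return h, label


def streamlined_decode(enc_states, h0, stages):
    """Run the stages in order, handing each one the previous hidden state."""
    h, labels = h0, []
    for stage in stages:
        h, label = stage.step(h, enc_states)
        labels.append(label)
    return labels, h


rng = random.Random(0)
HIDDEN = 4
# Five made-up encoder states for a five-token utterance.
enc_states = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(5)]
# Label-set sizes (3 acts, 6 slots, 10 values) are illustrative only.
stages = [DecoderStage(HIDDEN, n, rng) for n in (3, 6, 10)]
labels, final_h = streamlined_decode(enc_states, enc_states[-1], stages)
print(labels)
```

In this sketch the hand-off is the single line `h, label = stage.step(h, enc_states)`: removing the reuse of `h` across iterations would give three independent decoders, which is the situation the streamlined structure is meant to avoid.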
