OneNet: Joint Domain, Intent, Slot Prediction for Spoken Language Understanding

In practice, most spoken language understanding systems process user input in a pipelined manner: first the domain is predicted, and then the intent and semantic slots are inferred according to the semantic frames of the predicted domain. The pipeline approach, however, has two disadvantages: error propagation and a lack of information sharing between tasks. To address these issues, we present a unified neural network that jointly performs domain, intent, and slot prediction. Our approach adopts a principled multitask learning architecture that folds in the state-of-the-art models for each task. With a few additional ingredients, e.g. orthography-sensitive input encoding and curriculum training, our model delivers significant improvements in all three tasks across all domains over strong baselines, including one that uses oracle predictions for domain detection, on real user data from a commercial personal assistant.
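
To make the joint architecture concrete, the following is a minimal PyTorch sketch (the module name OneNetSketch and every hyperparameter in it are hypothetical, not the authors' settings): a character-level BiLSTM provides the orthography-sensitive encoding, a shared word-level BiLSTM encodes the utterance, and three output heads predict domain, intent, and slots. A per-token softmax stands in here for the CRF output layer commonly used in state-of-the-art slot taggers.

```python
# Minimal sketch of a jointly trained domain/intent/slot model in PyTorch.
# All names and hyperparameters (char_dim, hidden, mean-pooling, the softmax
# slot layer) are illustrative assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class OneNetSketch(nn.Module):
    def __init__(self, n_chars, n_words, n_domains, n_intents, n_slots,
                 char_dim=25, word_dim=100, hidden=200):
        super().__init__()
        # Orthography-sensitive input encoding: a character-level BiLSTM
        # whose final states are concatenated onto each word embedding.
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_dim,
                                 bidirectional=True, batch_first=True)
        self.word_emb = nn.Embedding(n_words, word_dim)
        # Shared utterance encoder: the source of information sharing
        # across the three tasks.
        self.encoder = nn.LSTM(word_dim + 2 * char_dim, hidden,
                               bidirectional=True, batch_first=True)
        # One output head per task.
        self.domain_head = nn.Linear(2 * hidden, n_domains)
        self.intent_head = nn.Linear(2 * hidden, n_intents)
        self.slot_head = nn.Linear(2 * hidden, n_slots)

    def forward(self, word_ids, char_ids):
        # word_ids: (T,) token ids; char_ids: list of T 1-D char-id tensors.
        char_feats = []
        for chars in char_ids:
            _, (h, _) = self.char_lstm(self.char_emb(chars).unsqueeze(0))
            char_feats.append(torch.cat([h[0, 0], h[1, 0]]))  # fwd+bwd states
        chars = torch.stack(char_feats)                        # (T, 2*char_dim)
        tokens = torch.cat([self.word_emb(word_ids), chars], dim=-1)
        states, _ = self.encoder(tokens.unsqueeze(0))
        states = states.squeeze(0)                             # (T, 2*hidden)
        pooled = states.mean(dim=0)  # utterance summary for the classifiers
        return (self.domain_head(pooled),   # domain logits
                self.intent_head(pooled),   # intent logits
                self.slot_head(states))     # per-token slot logits (BIO tags)
```

Training would minimize the sum of the three cross-entropy losses so that the shared encoder receives gradients from every task; one plausible reading of the curriculum training mentioned above is to start with the easier utterance-level losses before adding the slot loss. Both the mean-pooling and the unweighted loss sum are design assumptions of this sketch, not details confirmed by the paper.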
