Deep Cascade Multi-task Learning for Slot Filling in Chinese E-commerce Shopping Guide Assistant

Slot filling is a critical task in natural language understanding (NLU) for dialog systems. State-of-the-art solutions regard it as a sequence labeling task and adopt BiLSTM-CRF models. While BiLSTM-CRF models work relatively well on standard datasets, they face challenges in Chinese E-commerce slot filling due to more informative slot labels and richer expressions. In this paper, we propose a deep multi-task learning model with cascade and residual connections. Experimental results show that our framework not only achieves competitive performance with state-of-the-art methods on a standard dataset, but also outperforms strong baselines by a substantial margin of 14.6% on a Chinese E-commerce dataset.
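
The abstract does not spell out the architecture, but the idea of a cascade with a residual connection can be illustrated with a minimal sketch: a shared BiLSTM encoder feeds a low-level tagging head, and the low-level hidden states are added back to the shared representation (the residual connection) before the high-level slot-labeling head. The following PyTorch code is a hypothetical illustration under those assumptions, not the authors' implementation; all module names, dimensions, and tag counts are illustrative, and the CRF layer is omitted for brevity.

```python
import torch
import torch.nn as nn

class CascadeTagger(nn.Module):
    """Hypothetical two-level cascade tagger with a residual connection."""
    def __init__(self, vocab_size, emb_dim=100, hidden=128,
                 n_seg_tags=4, n_slot_tags=30):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Shared bidirectional encoder over the input sequence.
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)
        # Low-level task head (e.g., coarse segment tagging).
        self.seg_lstm = nn.LSTM(2 * hidden, hidden, batch_first=True,
                                bidirectional=True)
        self.seg_out = nn.Linear(2 * hidden, n_seg_tags)
        # High-level task head (fine-grained slot labeling), fed by the cascade.
        self.slot_lstm = nn.LSTM(2 * hidden, hidden, batch_first=True,
                                 bidirectional=True)
        self.slot_out = nn.Linear(2 * hidden, n_slot_tags)

    def forward(self, tokens):
        shared, _ = self.encoder(self.embed(tokens))
        seg_hidden, _ = self.seg_lstm(shared)
        seg_logits = self.seg_out(seg_hidden)
        # Residual connection: add the shared representation back in so the
        # high-level task sees both the raw encoding and the cascaded features.
        slot_hidden, _ = self.slot_lstm(shared + seg_hidden)
        slot_logits = self.slot_out(slot_hidden)
        return seg_logits, slot_logits

model = CascadeTagger(vocab_size=5000)
tokens = torch.randint(0, 5000, (2, 12))    # batch of 2 sequences, length 12
seg_logits, slot_logits = model(tokens)
print(seg_logits.shape, slot_logits.shape)  # (2, 12, 4) (2, 12, 30)
```

In a multi-task setup of this kind, both heads would typically be trained jointly, with the total loss being a weighted sum of the per-task tagging losses.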
