Zero-shot learning of user intent understanding by Convolutional Neural Networks

User intent is the goal underlying a user-generated utterance, and it plays a critical role in many intelligent applications, such as dialog systems and search engines. Most previous work treats intent understanding as a supervised classification problem, under the assumption that utterances are labeled with predefined intents. However, how to detect emerging user intents for which no labeled utterances are available remains an open question. In this paper, we present a zero-shot learning approach for intent understanding that can predict, at runtime, intents that did not exist at training time. Our approach uses Convolutional Neural Networks to extract semantic features from existing intents and emerging intents respectively, and then discriminates emerging intents via knowledge transfer from existing intents. Experiments on a real-world dataset show that our model performs better at discriminating emerging intents when no labeled utterances are available.
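The general idea of scoring an utterance against intents that have no labeled examples can be illustrated with a minimal sketch. This is not the paper's model: the word vectors, the hand-set convolution filters (standing in for weights that would be learned on existing intents), the intent names, and the cosine-similarity scoring are all assumptions made for illustration. The sketch encodes both the utterance and each intent name with the same convolution and max-over-time pooling, then picks the closest intent.

```python
import math

# Hypothetical toy word vectors; a real system would use pretrained embeddings.
EMBED = {
    "play": [1.0, 0.0, 0.0],
    "music": [0.9, 0.1, 0.0],
    "book": [0.0, 1.0, 0.0],
    "flight": [0.0, 0.9, 0.1],
    "weather": [0.0, 0.0, 1.0],
    "forecast": [0.0, 0.1, 0.9],
}
DIM = 3
WIDTH = 2  # convolution window size, in tokens

# Hand-set filters standing in for weights learned on existing intents
# (an assumption). Filter k sums embedding dimension k over the window.
FILTERS = [[1.0 if i % DIM == k else 0.0 for i in range(WIDTH * DIM)]
           for k in range(DIM)]


def encode(text):
    """1-D convolution over token windows, ReLU, then max-over-time pooling."""
    vecs = [EMBED.get(tok, [0.0] * DIM) for tok in text.lower().split()]
    while len(vecs) < WIDTH:  # zero-pad inputs shorter than the window
        vecs.append([0.0] * DIM)
    pooled = [0.0] * len(FILTERS)
    for start in range(len(vecs) - WIDTH + 1):
        window = [x for vec in vecs[start:start + WIDTH] for x in vec]
        for k, filt in enumerate(FILTERS):
            act = max(0.0, sum(w * x for w, x in zip(filt, window)))
            pooled[k] = max(pooled[k], act)
    return pooled


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0


def predict(utterance, intent_names):
    """Score the utterance against every intent name in the shared feature space."""
    feats = encode(utterance)
    scores = {name: cosine(feats, encode(name)) for name in intent_names}
    return max(scores, key=scores.get), scores


existing = ["play music", "book flight"]
emerging = ["weather forecast"]  # no labeled utterances for this intent
best, scores = predict("what is the forecast today", existing + emerging)
print(best)
```

With these toy vectors the utterance lands closest to the emerging intent "weather forecast", even though no utterance was ever labeled with it. In a trained system the filters would be fitted on utterances of the existing intents, which is where the knowledge transfer would come from.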
