Unknown Intent Detection Using Gaussian Mixture Model with an Application to Zero-shot Intent Classification

User intent classification plays a vital role in dialogue systems. Because user intents often change over time in realistic scenarios, detecting unknown (new) intents has become an essential problem, yet research on it has only just begun. This paper proposes a semantic-enhanced Gaussian mixture model (SEG) for unknown intent detection. In particular, we model utterance embeddings with a Gaussian mixture distribution and inject dynamic class semantic information into the Gaussian means, which yields more class-concentrated embeddings that facilitate downstream outlier detection. Coupled with a density-based outlier detection algorithm, SEG achieves competitive results on three real task-oriented dialogue datasets in two languages for unknown intent detection. In addition, we propose integrating SEG as an unknown intent identifier into existing generalized zero-shot intent classification models to improve their performance. A case study on a state-of-the-art method, ReCapsNet, shows that SEG can push the classification performance to a significantly higher level.
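
To make the idea concrete, the following is a minimal sketch of density-based unknown detection over a Gaussian mixture, not the SEG implementation itself. It assumes isotropic unit-variance components, a fixed log-density threshold, and a hypothetical `semantic_means` helper that shifts each class center by a scaled label vector; none of these details come from the abstract, and thresholding the mixture density is used here only as a simple stand-in for the density-based outlier detector.

```python
import math

def log_gaussian(z, mean, var=1.0):
    """Log-density of an isotropic Gaussian N(mean, var * I) at point z."""
    d = len(z)
    sq_dist = sum((zi - mi) ** 2 for zi, mi in zip(z, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)

def mixture_log_density(z, means, priors, var=1.0):
    """Log-density of z under the mixture, using log-sum-exp for stability."""
    terms = [math.log(p) + log_gaussian(z, m, var) for m, p in zip(means, priors)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def semantic_means(centers, label_vecs, weight=0.5):
    """Hypothetical injection step: shift each class center by a scaled label vector."""
    return [[c + weight * s for c, s in zip(center, label)]
            for center, label in zip(centers, label_vecs)]

def is_unknown(z, means, priors, threshold, var=1.0):
    """Flag z as an unknown intent when its mixture log-density is below threshold."""
    return mixture_log_density(z, means, priors, var) < threshold

# Toy 2-D setup: two known intents with made-up centers and label vectors.
centers = [[0.0, 0.0], [4.0, 4.0]]
label_vecs = [[1.0, 0.0], [0.0, 1.0]]
means = semantic_means(centers, label_vecs)
priors = [0.5, 0.5]
threshold = -6.0  # assumed value; in practice it would be tuned on held-out data

near_known = [0.4, 0.1]   # close to the first class mean
far_away = [10.0, -6.0]   # far from both class means

print(is_unknown(near_known, means, priors, threshold))
print(is_unknown(far_away, means, priors, threshold))
```

In this sketch, the more tightly the embeddings of known intents cluster around their class means, the more cleanly a low-density point separates from them, which is the intuition behind learning class-concentrated embeddings before running outlier detection.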
