Out-of-Scope Intent Detection with Self-Supervision and Discriminative Training

Out-of-scope intent detection is of practical importance in task-oriented dialogue systems. Since the distribution of outlier utterances is arbitrary and unknown at training time, existing methods commonly rely on strong assumptions about the data distribution, such as a mixture of Gaussians, for inference, resulting in either complex multi-step training procedures or hand-crafted rules such as confidence threshold selection for outlier detection. In this paper, we propose a simple yet effective method to train an out-of-scope intent classifier in a fully end-to-end manner by simulating the test scenario during training, which requires no assumption on the data distribution and no additional post-processing or threshold setting. Specifically, we construct a set of pseudo outliers during training by generating synthetic outliers from inlier features via self-supervision and by sampling out-of-scope sentences from easily available open-domain datasets. The pseudo outliers are used to train a discriminative classifier that can be applied directly to the test task and generalizes well. We evaluate our method extensively on four benchmark dialogue datasets and observe significant improvements over state-of-the-art approaches. Our code has been released at https://github.com/liam0949/DCLOOS.
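
To make the training recipe concrete, below is a minimal PyTorch sketch of the two ingredients the abstract describes: synthetic outliers formed from inlier features via self-supervision, and a (K+1)-way discriminative classifier that treats all pseudo outliers as one extra class. The function names, the Beta-distributed mixing weights, and the exact feature-mixing scheme are illustrative assumptions on our part, not the paper's verified recipe; see the released code at the repository above for the authors' implementation.

```python
import torch
import torch.nn.functional as F

def make_synthetic_outliers(inlier_feats: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """Build pseudo-outlier features as convex combinations of pairs of
    inlier (in-scope) encoder features -- a self-supervised construction.
    inlier_feats: (batch, dim) sentence representations, e.g. from BERT.
    """
    perm = torch.randperm(inlier_feats.size(0))
    # Mixing weights from Beta(alpha, alpha); weights near 0.5 place the
    # synthetic points between class clusters, i.e. in low-density regions.
    lam = torch.distributions.Beta(alpha, alpha).sample((inlier_feats.size(0), 1))
    return lam * inlier_feats + (1.0 - lam) * inlier_feats[perm]

def discriminative_loss(in_logits: torch.Tensor, in_labels: torch.Tensor,
                        out_logits: torch.Tensor, num_in_scope: int) -> torch.Tensor:
    """(K+1)-way cross-entropy: in-scope utterances keep their intent labels
    0..K-1; every pseudo outlier (synthetic, or sampled from an open-domain
    corpus) is assigned the single extra out-of-scope class K."""
    oos_labels = torch.full((out_logits.size(0),), num_in_scope, dtype=torch.long)
    logits = torch.cat([in_logits, out_logits], dim=0)
    labels = torch.cat([in_labels, oos_labels], dim=0)
    return F.cross_entropy(logits, labels)
```

At test time the trained classifier is applied as-is: an utterance is flagged as out-of-scope whenever the (K+1)-th logit wins the argmax, which is what allows the method to avoid any confidence-threshold tuning or post-processing.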
