GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation

Practical dialogue systems require robust methods of detecting out-of-scope (OOS) utterances to avoid conversational breakdowns and related failure modes. Directly training a model with labeled OOS examples yields reasonable performance, but obtaining such data is a resource-intensive process. To tackle this limited-data problem, previous methods focus on better modeling the distribution of in-scope (INS) examples. We introduce GOLD as an orthogonal technique that augments existing data to train better OOS detectors operating in low-data regimes. GOLD generates pseudo-labeled candidates using samples from an auxiliary dataset and keeps only the most beneficial candidates for training through a novel filtering mechanism. In experiments across three target benchmarks, the top GOLD model outperforms all existing methods on all key metrics, achieving relative gains of 52.4%, 48.9% and 50.3% against median baseline performance. We also analyze the unique properties of OOS data to identify key factors for optimally applying our proposed method.
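As a rough illustration of the generate-then-filter idea described in the abstract, the following Python sketch pulls pseudo-labeled OOS candidates from an auxiliary pool and then discards some of them. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not specify how candidates are matched or filtered, so the bag-of-words cosine similarity, the function names (`generate_candidates`, `filter_candidates`), the thresholds, and the toy utterances are all hypothetical stand-ins.

```python
from collections import Counter
from math import sqrt


def bow(text):
    """Bag-of-words vector as a token -> count mapping (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def generate_candidates(seed_oos, auxiliary, top_k=3):
    """For each seed OOS utterance, take the top_k most similar auxiliary
    utterances and treat them as pseudo-labeled OOS candidates."""
    candidates = []
    for seed in seed_oos:
        ranked = sorted(auxiliary, key=lambda u: cosine(bow(seed), bow(u)), reverse=True)
        for utterance in ranked[:top_k]:
            if utterance not in candidates:
                candidates.append(utterance)
    return candidates


def filter_candidates(candidates, in_scope, max_ins_similarity=0.5):
    """Placeholder filter (an assumption, not the paper's mechanism):
    drop candidates that closely resemble any in-scope example."""
    kept = []
    for cand in candidates:
        closest = max(cosine(bow(cand), bow(x)) for x in in_scope)
        if closest < max_ins_similarity:
            kept.append(cand)
    return kept


# Toy data, invented for illustration only.
in_scope = ["book a flight to boston", "cancel my flight reservation"]
seed_oos = ["what is the weather today"]
auxiliary = [
    "what is the forecast today",
    "book a flight to denver",
    "tell me a joke",
    "is the weather nice today",
]

candidates = generate_candidates(seed_oos, auxiliary)
augmented = filter_candidates(candidates, in_scope)
print(augmented)
```

In this toy run, the candidate "book a flight to denver" is removed because it overlaps heavily with an in-scope request, while the two weather-related utterances survive as extra OOS training examples. A real system would likely replace the bag-of-words similarity with learned sentence embeddings and use a stronger filtering criterion.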

Zhou Yu | Derek Chen
