Overcoming Conflicting Data when Updating a Neural Semantic Parser

In this paper, we explore how to use a small amount of new data to update a task-oriented semantic parsing model when the desired output for some examples has changed. When making updates in this way, one potential problem is conflicting data: out-of-date labels in the original training set that contradict the new desired outputs. To evaluate the impact of this understudied problem, we propose an experimental setup for simulating changes to a neural semantic parser. We show that the presence of conflicting data greatly hinders learning of an update, then explore several methods to mitigate its effect. Our multi-task and data-selection methods lead to large improvements in model accuracy compared to a naive data-mixing strategy, and our best method closes 86% of the accuracy gap between this baseline and an oracle upper bound.
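The abstract does not spell out the methods in code, but the contrast between the naive data-mixing baseline and a data-selection approach can be pictured with a minimal sketch: mixing simply concatenates the old and new data, while data selection first drops original examples whose stale labels contradict the update. The `Example` type and the exact-match `conflicts_with_update` heuristic below are hypothetical illustrations, not the paper's implementation.

```python
# Hypothetical sketch: naive data mixing vs. data selection for a model
# update. Names (Example, conflicts_with_update) are illustrative only.
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    utterance: str
    parse: str  # target logical form / parse string


def conflicts_with_update(old: Example, update_set: List[Example]) -> bool:
    """An old example conflicts if an updated example maps the same
    utterance to a different target parse. Exact utterance match is a
    stand-in here for a real conflict detector."""
    return any(
        old.utterance == new.utterance and old.parse != new.parse
        for new in update_set
    )


def build_training_set(original: List[Example],
                       update: List[Example],
                       filter_conflicts: bool = True) -> List[Example]:
    """Naive data mixing keeps every old example; data selection drops
    old examples whose out-of-date labels contradict the update."""
    if not filter_conflicts:
        return original + update  # naive data-mixing baseline
    kept = [ex for ex in original
            if not conflicts_with_update(ex, update)]
    return kept + update
```

In this sketch, training on `build_training_set(original, update, filter_conflicts=False)` corresponds to the data-mixing baseline, while passing `filter_conflicts=True` removes the conflicting supervision that the paper identifies as the main obstacle to learning the update.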
