Adversarial Training for Commonsense Inference

We propose an AdversariaL training algorithm for commonsense InferenCE (ALICE). We apply small perturbations to word embeddings and minimize the resulting adversarial risk to regularize the model. We estimate these perturbations with a novel combination of two approaches: 1) using the true label and 2) using the model prediction. Without relying on any human-crafted features, knowledge bases, or datasets other than the target datasets, our model boosts the fine-tuning performance of RoBERTa, achieving competitive results on multiple reading comprehension datasets that require commonsense inference.
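The objective can be sketched as a standard fine-tuning loss plus two adversarial regularizers, one estimated from the true label and one from the model's own prediction. The following is a minimal single-step sketch, not the authors' implementation: it assumes a PyTorch model that accepts word embeddings directly via an `inputs_embeds` argument, and the names `epsilon`, `alpha`, and `alice_style_step` are hypothetical; per-example norm constraints and multi-step perturbation estimation are elided.

import torch
import torch.nn.functional as F

def alice_style_step(model, embeds, labels, epsilon=1e-5, alpha=1.0):
    """Sketch of one training step: clean loss + two adversarial regularizers.

    `model` is assumed to map word embeddings to logits, e.g.
    logits = model(inputs_embeds=embeds). All names are illustrative.
    """
    # Standard supervised loss on the unperturbed embeddings.
    logits = model(inputs_embeds=embeds)
    loss = F.cross_entropy(logits, labels)

    # 1) Perturbation estimated from the true label (supervised adversarial risk).
    delta = torch.zeros_like(embeds, requires_grad=True)
    adv_loss = F.cross_entropy(model(inputs_embeds=embeds + delta), labels)
    grad, = torch.autograd.grad(adv_loss, delta)
    delta_label = epsilon * grad / (grad.norm(p=2) + 1e-12)  # normalized ascent direction
    loss_label = F.cross_entropy(
        model(inputs_embeds=embeds + delta_label.detach()), labels)

    # 2) Perturbation estimated from the model prediction (virtual adversarial risk).
    with torch.no_grad():
        pred = F.softmax(logits, dim=-1)  # current prediction used as the target
    delta = torch.zeros_like(embeds, requires_grad=True)
    vat_loss = F.kl_div(
        F.log_softmax(model(inputs_embeds=embeds + delta), dim=-1),
        pred, reduction="batchmean")
    grad, = torch.autograd.grad(vat_loss, delta)
    delta_pred = epsilon * grad / (grad.norm(p=2) + 1e-12)
    loss_pred = F.kl_div(
        F.log_softmax(model(inputs_embeds=embeds + delta_pred.detach()), dim=-1),
        pred, reduction="batchmean")

    # Total objective: clean loss plus both adversarial regularizers.
    return loss + alpha * (loss_label + loss_pred)

In practice the perturbation norm is typically constrained per example and refined over several projected gradient steps; the single-step, global-norm version above only illustrates how the label-based and prediction-based perturbations combine in the objective.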
