A Generative Approach for Mitigating Structural Biases in Natural Language Inference

Many natural language inference (NLI) datasets contain biases that allow models to perform well by using only a biased subset of the input, without considering the remaining features. For instance, models are able to make a classification decision using only the hypothesis, without learning the true relationship between it and the premise. These structural biases lead discriminative models to learn unintended superficial features and to generalize poorly outside the training distribution. In this work, we reformulate the NLI task as a generative task, where a model is conditioned on the biased subset of the input and the label, and generates the remaining subset of the input. We show that by imposing a uniform prior, we obtain a provably unbiased model. Through synthetic experiments, we find that this approach is highly robust to large amounts of bias. We then demonstrate empirically on two types of natural bias that this approach leads to fully unbiased models in practice. However, we find that generative models are difficult to train and generally perform worse than discriminative baselines. We highlight the difficulty of the generative modeling task in the context of NLI as a cause for this worse performance. Finally, by fine-tuning the generative model with a discriminative objective, we reduce the performance gap between the generative model and the discriminative baseline, while allowing for a small amount of bias.
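To make the reformulation concrete: by Bayes' rule, p(y | P, H) ∝ p(P | H, y) p(y | H), so under a uniform prior p(y | H) = 1/|Y| the classification decision depends only on the generative term p(P | H, y), and hypothesis-only shortcuts cannot determine the label. The following is a minimal sketch of this scoring scheme, assuming a BART generator and a hypothetical "label: hypothesis" prompt format; neither is claimed to be the paper's exact configuration.

```python
# Sketch: generative NLI classification under a uniform label prior.
# The checkpoint and prompt format below are illustrative assumptions.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
model.eval()

LABELS = ["entailment", "neutral", "contradiction"]

def premise_log_likelihood(premise: str, hypothesis: str, label: str) -> float:
    # Condition on the biased subset (hypothesis + label) and score the
    # remaining input (the premise) with the decoder.
    source = f"{label}: {hypothesis}"  # hypothetical control-code format
    enc = tokenizer(source, return_tensors="pt")
    dec = tokenizer(premise, return_tensors="pt")
    with torch.no_grad():
        out = model(input_ids=enc.input_ids,
                    attention_mask=enc.attention_mask,
                    labels=dec.input_ids)
    # out.loss is the mean per-token negative log-likelihood; negate and
    # scale by length to recover the total log-likelihood log p(P | H, y).
    return -out.loss.item() * dec.input_ids.size(1)

def classify(premise: str, hypothesis: str) -> str:
    # With a uniform prior p(y | H) = 1/3, argmax_y p(y | P, H) reduces
    # to argmax_y p(P | H, y) by Bayes' rule.
    scores = {y: premise_log_likelihood(premise, hypothesis, y) for y in LABELS}
    return max(scores, key=scores.get)

print(classify("A man is playing a guitar on stage.",
               "A person is performing music."))
```

Because the premise tokenization is identical across the three candidate labels, comparing total (or mean) log-likelihoods yields the same argmax; the length scaling above only matters if scores are combined with a non-uniform prior.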
