Unsupervised Natural Language Inference Using PHL Triplet Generation

Transformer-based models achieve impressive performance on numerous Natural Language Inference (NLI) benchmarks when trained on the respective training datasets. However, in certain cases, training samples may not be available, or collecting them could be time-consuming and resource-intensive. In this work, we address the above challenge and present an explorative study on unsupervised NLI, a paradigm in which no human-annotated training samples are available. We investigate it under three settings: PH, P, and NPH, which differ in the extent of unlabeled data available for learning. As a solution, we propose a procedural data generation approach that leverages a set of sentence transformations to collect PHL (Premise, Hypothesis, Label) triplets for training NLI models, bypassing the need for human-annotated training data. Comprehensive experiments with several NLI datasets show that the proposed approach results in accuracies of up to 66.75%, 65.9%, and 65.39% in the PH, P, and NPH settings respectively, outperforming all existing unsupervised baselines. Furthermore, fine-tuning our model with as little as ~0.1% of the human-annotated training dataset (500 instances) leads to 12.2% higher accuracy than a model trained from scratch on the same 500 instances. Supported by this superior performance, we conclude with a recommendation for collecting high-quality task-specific data.
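
To make the idea of procedural PHL triplet generation concrete, the sketch below shows how simple sentence transformations might turn unlabeled premises into (Premise, Hypothesis, Label) triplets. This is a minimal illustration under assumed transformations (hypernym substitution for entailment, explicit negation for contradiction, an unrelated sentence for neutral); the function names and the transformation set are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal, illustrative sketch of procedural PHL (Premise, Hypothesis, Label)
# triplet generation from unlabeled premises. The transformation set here is
# an assumption for illustration, not the paper's exact transformation list.
import random

def entailment_hypothesis(premise: str) -> str:
    # Hypothetical transformation: replace specific nouns with more general
    # ones so the hypothesis is implied by the premise.
    generalizations = {"dog": "animal", "car": "vehicle", "apple": "fruit"}
    return " ".join(generalizations.get(w.lower(), w) for w in premise.split())

def contradiction_hypothesis(premise: str) -> str:
    # Hypothetical transformation: wrap the premise in an explicit negation.
    return "It is not true that " + premise[0].lower() + premise[1:]

def neutral_hypothesis(premise: str, corpus: list[str]) -> str:
    # Hypothetical transformation: pair the premise with an unrelated sentence.
    candidates = [s for s in corpus if s != premise]
    return random.choice(candidates) if candidates else premise

def generate_phl_triplets(premises: list[str]) -> list[tuple[str, str, str]]:
    """Turn unlabeled premises into (premise, hypothesis, label) triplets."""
    triplets = []
    for p in premises:
        triplets.append((p, entailment_hypothesis(p), "entailment"))
        triplets.append((p, contradiction_hypothesis(p), "contradiction"))
        triplets.append((p, neutral_hypothesis(p, premises), "neutral"))
    return triplets

if __name__ == "__main__":
    premises = ["A dog chased the car down the street.",
                "Two children are eating an apple in the park."]
    for premise, hypothesis, label in generate_phl_triplets(premises):
        print(f"{label:13s} | {premise} -> {hypothesis}")
```

The resulting triplets can then be used to train a standard NLI classifier in place of human-annotated data, which is the setting the abstract describes.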
