Towards Debiasing NLU Models from Unknown Biases
[1] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[2] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[3] R. Thomas McCoy, et al. Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference, 2019, ACL.
[4] Yejin Choi, et al. WinoGrande: An Adversarial Winograd Schema Challenge at Scale, 2020, AAAI.
[5] Carolyn Penstein Rosé, et al. Stress Test Evaluation for Natural Language Inference, 2018, COLING.
[6] Peter Clark, et al. The Seventh PASCAL Recognizing Textual Entailment Challenge, 2011, TAC.
[7] Timothy J. Hazen, et al. Robust Natural Language Inference Models with Example Forgetting, 2019, ArXiv.
[8] Lifu Tu, et al. An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models, 2020, Transactions of the Association for Computational Linguistics.
[9] Yejin Choi, et al. The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task, 2017, CoNLL.
[10] Zachary Chase Lipton, et al. Born Again Neural Networks, 2018, ICML.
[11] Dirk Hovy, et al. Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview, 2019, ACL.
[12] Ido Dagan, et al. The Third PASCAL Recognizing Textual Entailment Challenge, 2007, ACL-PASCAL@ACL.
[13] Zachary C. Lipton, et al. How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks, 2018, EMNLP.
[14] Yoav Goldberg, et al. Adversarial Removal of Demographic Attributes from Text Data, 2018, EMNLP.
[15] Marius Mosbach, et al. On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines, 2020, ArXiv.
[16] James Henderson, et al. Simple but effective techniques to reduce biases, 2019, ArXiv.
[17] R. Thomas McCoy, et al. BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance, 2020, BlackboxNLP@EMNLP.
[18] Reut Tsarfaty, et al. Evaluating NLP Models via Contrast Sets, 2020, ArXiv.
[19] Jason Baldridge, et al. PAWS: Paraphrase Adversaries from Word Scrambling, 2019, NAACL.
[20] Eduard Hovy, et al. Learning the Difference that Makes a Difference with Counterfactually-Augmented Data, 2020, ICLR.
[21] Ali Farhadi, et al. HellaSwag: Can a Machine Really Finish Your Sentence?, 2019, ACL.
[22] Rachel Rudinger, et al. Hypothesis Only Baselines in Natural Language Inference, 2018, *SEM.
[23] Noah D. Goodman, et al. Evaluating Compositionality in Sentence Embeddings, 2018, CogSci.
[24] Luke Zettlemoyer, et al. Don’t Take the Easy Way Out: Ensemble Based Methods for Avoiding Known Dataset Biases, 2019, EMNLP.
[25] Yonatan Belinkov, et al. On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference, 2019, *SEM.
[26] Yoav Goldberg, et al. Breaking NLI Systems with Sentences that Require Simple Lexical Inferences, 2018, ACL.
[27] Percy Liang, et al. Adversarial Examples for Evaluating Reading Comprehension Systems, 2017, EMNLP.
[28] Anton van den Hengel, et al. On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law, 2020, NeurIPS.
[29] Peter Clark, et al. SciTaiL: A Textual Entailment Dataset from Science Question Answering, 2018, AAAI.
[30] James Allen, et al. Tackling the Story Ending Biases in The Story Cloze Test, 2018, ACL.
[31] Ronan Le Bras, et al. Adversarial Filters of Dataset Biases, 2020, ICML.
[33] Lifu Tu, et al. Pay Attention to the Ending: Strong Neural Baselines for the ROC Story Cloze Task, 2017, ACL.
[34] Hao Tan, et al. The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions, 2020, EMNLP.
[35] Yejin Choi, et al. SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference, 2018, EMNLP.
[36] Iryna Gurevych, et al. Improving QA Generalization by Concurrent Modeling of Multiple Biases, 2020, Findings of EMNLP.
[37] Masatoshi Tsuchiya, et al. Performance Impact Caused by Hidden Bias of Training Data for Recognizing Textual Entailment, 2018, LREC.
[38] Iryna Gurevych, et al. Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance, 2020, ACL.
[39] Mohit Bansal, et al. Adversarial NLI: A New Benchmark for Natural Language Understanding, 2020, ACL.
[40] Marco Marelli, et al. A SICK cure for the evaluation of compositional distributional semantic models, 2014, LREC.
[41] Hung-Yu Kao, et al. Probing Neural Network Comprehension of Natural Language Arguments, 2019, ACL.
[42] Thomas Wolf, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[43] Arul Menezes, et al. Effectively Using Syntax for Recognizing False Entailment, 2006, NAACL.
[44] Lucy Vanderwende, et al. What Syntax Can Contribute in the Entailment Task, 2005, MLCW.
[45] Haohan Wang, et al. Unlearn Dataset Bias in Natural Language Inference by Fitting the Residual, 2019, EMNLP.
[46] Kilian Q. Weinberger, et al. Revisiting Few-sample BERT Fine-tuning, 2020, ArXiv.
[47] Yoshua Bengio, et al. A Closer Look at Memorization in Deep Networks, 2017, ICML.
[48] Mohit Bansal, et al. Analyzing Compositionality-Sensitivity of NLI Models, 2018, AAAI.
[49] Roy Bar-Haim, et al. The Second PASCAL Recognising Textual Entailment Challenge, 2006.
[50] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[51] Andreas Vlachos, et al. The Fact Extraction and VERification (FEVER) Shared Task, 2018, FEVER@EMNLP.
[52] Yonatan Belinkov, et al. Don’t Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference, 2019, ACL.
[53] Yonatan Belinkov, et al. End-to-End Bias Mitigation by Modelling Biases in Corpora, 2020, ACL.
[54] Sheng Liu, et al. Early-Learning Regularization Prevents Memorization of Noisy Labels, 2020, NeurIPS.
[55] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[56] Regina Barzilay, et al. Towards Debiasing Fact Verification Models, 2019, EMNLP.
[57] Omer Levy, et al. Annotation Artifacts in Natural Language Inference Data, 2018, NAACL.