Do Neural Language Models Overcome Reporting Bias?

Mining commonsense knowledge from corpora suffers from reporting bias, over-representing the rare at the expense of the trivial (Gordon and Van Durme, 2013). We study to what extent pre-trained language models overcome this issue. We find that while their generalization capacity allows them to better estimate the plausibility of frequent but unspoken-of actions, outcomes, and properties, they also tend to overestimate the plausibility of the very rare, amplifying the bias that already exists in their training corpus.
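To make the probing setup concrete, here is a minimal sketch of how one might query a masked language model for the probability it assigns to candidate completions of a templated statement, using the fill-mask pipeline from HuggingFace Transformers [26]. This is an illustrative assumption, not the paper's exact protocol; the template sentence, candidate words, and model choice below are hypothetical.

```python
from transformers import pipeline

# Sketch only: probe a masked LM for how plausible it finds candidate fillers.
# The probability assigned to each candidate in the [MASK] slot serves as a
# rough plausibility estimate, which can then be compared against corpus
# co-occurrence counts to study reporting bias.
fill = pipeline("fill-mask", model="bert-base-uncased")

template = "Most bananas are [MASK]."     # hypothetical probe sentence
candidates = ["yellow", "green", "blue"]  # hypothetical candidate fillers

# Restrict scoring to the candidate words and print the model's probabilities.
for result in fill(template, targets=candidates):
    print(f"{result['token_str']:>8}  p={result['score']:.4f}")
```

A reporting-bias analysis along these lines would contrast the model's probability ranking with how often each pairing is actually stated in text versus how often it holds in the world.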

[1] Rachel Rudinger, et al. “You Are Grounded!”: Latent Name Artifacts in Pre-trained Language Models, 2020, EMNLP.

[2] Emily M. Bender, et al. Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data, 2020, ACL.

[3] Ali Farhadi, et al. From Recognition to Cognition: Visual Commonsense Reasoning, 2019, CVPR.

[4] Marie-Francine Moens, et al. Is an Image Worth More than a Thousand Words? On the Fine-Grain Semantic Differences between Visual and Linguistic Representations, 2016, COLING.

[5] Alexander M. Rush, et al. Commonsense Knowledge Mining from Pretrained Models, 2019, EMNLP.

[6] Catherine Havasi, et al. Representing General Relational Knowledge in ConceptNet 5, 2012, LREC.

[7] Zornitsa Kozareva, et al. SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning, 2011, SemEval.

[8] Arjen van Dalen, et al. Structural Bias in Cross-National Perspective: How Political Systems and Journalism Cultures Influence Government Dominance in the News, 2012.

[9] Yejin Choi, et al. Do Neural Language Representations Learn Physical Commonsense?, 2019, CogSci.

[10] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.

[11] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.

[12] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.

[13] Yejin Choi, et al. Unsupervised Commonsense Question Answering with Self-Talk, 2020, EMNLP.

[14] Benjamin Van Durme, et al. Probing Neural Language Models for Human Tacit Assumptions, 2020.

[15] Hinrich Schütze, et al. Negated LAMA: Birds cannot fly, 2019, arXiv.

[16] Benjamin Van Durme, et al. Reporting bias and knowledge acquisition, 2013, AKBC '13.

[17] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, arXiv.

[18] Siobhan Chapman. Logic and Conversation, 2005.

[19] Steven Schockaert, et al. Inducing Relational Knowledge from BERT, 2019, AAAI.

[20] Nanyun Peng, et al. The Woman Worked as a Babysitter: On Biases in Language Generation, 2019, EMNLP.

[21] Sameer Singh, et al. Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling, 2019, ACL.

[22] Lenhart K. Schubert, et al. Discovering Commonsense Entailment Rules Implicit in Sentences, 2011, TextInfer@EMNLP.

[23] Yejin Choi, et al. The Curious Case of Neural Text Degeneration, 2019, ICLR.

[24] Sebastian Riedel, et al. Language Models as Knowledge Bases?, 2019, EMNLP.

[25] Yejin Choi, et al. ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning, 2019, AAAI.

[26] Rémi Louf, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, arXiv.

[27] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[28] Thomas G. Dietterich, et al. Inverting Grice's Maxims to Learn Rules from Natural Language Extractions, 2011, NIPS.

[29] Stephen Clark, et al. Multi- and Cross-Modal Semantics Beyond Vision: Grounding in Auditory Perception, 2015, EMNLP.

[30] Chandler May, et al. On Measuring Social Biases in Sentence Encoders, 2019, NAACL.

[31] Xinlei Chen, et al. Never-Ending Learning, 2012, ECAI.

[32] Yann Dauphin, et al. Hierarchical Neural Story Generation, 2018, ACL.

[33] Allyson Ettinger. What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models, 2019, TACL.

[34] Jacob Andreas, et al. Experience Grounds Language, 2020, EMNLP.

[35] Douglas B. Lenat, et al. CYC: a large-scale investment in knowledge infrastructure, 1995, CACM.