How Can We Know What Language Models Know?

Abstract: Recent work has presented intriguing results examining the knowledge contained in language models (LMs) by having the LM fill in the blanks of prompts such as “Obama is a __ by profession”. These prompts are usually manually created, and quite possibly sub-optimal; another prompt such as “Obama worked as a __” may result in more accurately predicting the correct profession. Because of this, given an inappropriate prompt, we might fail to retrieve facts that the LM does know, and thus any given prompt only provides a lower bound estimate of the knowledge contained in an LM. In this paper, we attempt to more accurately estimate the knowledge contained in LMs by automatically discovering better prompts to use in this querying process. Specifically, we propose mining-based and paraphrasing-based methods to automatically generate high-quality and diverse prompts, as well as ensemble methods to combine answers from different prompts. Extensive experiments on the LAMA benchmark for extracting relational knowledge from LMs demonstrate that our methods can improve accuracy from 31.1% to 39.6%, providing a tighter lower bound on what LMs know. We have released the code and the resulting LM Prompt And Query Archive (LPAQA) at https://github.com/jzbjyb/LPAQA.
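To make the querying setup concrete, the sketch below shows one way to probe a masked LM with several paraphrased prompts for the same fact and ensemble the predictions by averaging probabilities. This is a minimal illustration assuming the Hugging Face transformers fill-mask pipeline; the prompt strings and the uniform-averaging scheme are illustrative assumptions, not the LPAQA implementation itself.

```python
# Minimal sketch (not the authors' LPAQA code): query a masked LM with several
# paraphrased prompts for the same relation and average the predicted
# probabilities over prompts. Prompts and weighting are illustrative choices.
from collections import defaultdict
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-cased")

# Several prompts expressing the same relation (profession-of); [MASK] is the blank.
prompts = [
    "Obama is a [MASK] by profession.",
    "Obama worked as a [MASK].",
    "Obama's profession is [MASK].",
]

# Ensemble: accumulate each candidate token's probability, averaged over prompts.
scores = defaultdict(float)
for prompt in prompts:
    for candidate in fill_mask(prompt, top_k=10):
        scores[candidate["token_str"]] += candidate["score"] / len(prompts)

best = max(scores, key=scores.get)
print(f"Predicted object: {best}")
```

A single prompt corresponds to using one entry of `prompts`; the ensemble over paraphrases is what tightens the lower-bound estimate when some phrasings fail to elicit a fact the LM actually encodes.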
