GPT Perdetry Test: Generating new meanings for new words

Human innovation in language, such as inventing new words, is a challenge for pretrained language models. We assess the ability of one large model, GPT-3, to process new words and decide on their meaning. We create a set of nonce words and prompt GPT-3 to generate their dictionary definitions. We find GPT-3 produces plausible definitions that align with human judgments. Moreover, GPT-3’s definitions are sometimes preferred to those invented by humans, signaling its intriguing ability not just to adapt, but to add to the evolving vocabulary of the English language.
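
The prompting step described above can be illustrated with a minimal sketch of how a few-shot dictionary-definition prompt might be assembled. The `Word:`/`Definition:` layout, the function name `build_definition_prompt`, and the two example entries are assumptions made for illustration and are not the prompt used in the study; the nonce word is borrowed from the title. The resulting string would then be sent to a language model for completion, a step omitted here.

```python
from typing import List, Tuple


def build_definition_prompt(examples: List[Tuple[str, str]], nonce_word: str) -> str:
    """Format few-shot (word, definition) pairs, then leave the nonce word's definition open.

    The layout is a hypothetical choice for illustration, not the study's actual prompt.
    """
    blocks = [f"Word: {word}\nDefinition: {definition}" for word, definition in examples]
    # The final block ends at "Definition:" so the model continues with a definition.
    blocks.append(f"Word: {nonce_word}\nDefinition:")
    return "\n\n".join(blocks)


# Hypothetical in-context examples using real English words.
examples = [
    ("lantern", "a portable case that encloses and protects a light"),
    ("meander", "to follow a winding course"),
]

prompt = build_definition_prompt(examples, "perdetry")
print(prompt)
```

Ending the prompt at an open `Definition:` field is one common way to elicit a single completion in a fixed format; the number and choice of in-context examples would likely affect the style of the generated definition.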
