Verb Sense Clustering using Contextualized Word Representations for Semantic Frame Induction

Contextualized word representations have proven useful for various natural language processing tasks. However, it remains unclear to what extent they capture hand-coded semantic information such as semantic frames, which specify the semantic roles of the arguments associated with a predicate. In this paper, we focus on verbs that evoke different frames depending on the context, and we investigate how well contextualized word representations can distinguish among the frames that the same verb evokes. We also explore which types of representation are suitable for semantic frame induction. In our experiments, we compare seven contextualized word representations on two English frame-semantic resources, FrameNet and PropBank. We demonstrate that several of these representations, especially BERT and its variants, are highly informative for semantic frame induction. Furthermore, we examine the extent to which the contextualized representation of a verb can be used to estimate the number of frames that the verb can evoke.
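The abstract's final point, estimating how many frames a verb evokes from its contextualized representations, can be illustrated with a model-selection sketch. The snippet below is not the paper's actual method: the toy 2-D "embeddings" stand in for real contextual vectors, and the choice of Gaussian mixtures scored by the Bayesian Information Criterion is an assumption made here purely for illustration.

```python
# Hypothetical sketch: pick the number of frame clusters for one verb's
# contextual embeddings by minimizing the Bayesian Information Criterion (BIC).
# The synthetic data below stands in for real BERT-style vectors.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Simulate embeddings of a verb used in two distinct frames:
# two well-separated Gaussian clusters in a toy 2-D space.
embeddings = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.3, size=(50, 2)),
])

def estimate_num_frames(X, max_k=5):
    """Return the cluster count in 1..max_k with the lowest BIC."""
    bics = []
    for k in range(1, max_k + 1):
        gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
        bics.append(gmm.bic(X))
    return int(np.argmin(bics)) + 1

print(estimate_num_frames(embeddings))  # 2 for this toy data
```

With real data, the embeddings would come from running the verb's sentences through a contextualized encoder and extracting the vector at the verb's position; the model-selection step itself is unchanged.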
