A Multi-feature Classifier for Verbal Metaphor Identification in Russian Texts

This paper presents a supervised machine learning experiment that uses multiple features to identify sentences containing verbal metaphors in raw Russian text. We introduce a custom-built training dataset, describe the feature engineering techniques, and discuss the results. We apply the following features: distributional semantic features, lexical and morphosyntactic co-occurrence frequencies, flag words, quotation marks, and sentence length. We combine these features into models of varying complexity; the experiment shows that fairly simple models based on lexical, morphosyntactic, and semantic features produce competitive results.
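
As an illustration of the three surface features named in the abstract (flag words, quotation marks, and sentence length), the following minimal Python sketch extracts them from a single sentence. It is not the authors' implementation: the function name `surface_features`, the feature keys, and the contents of `FLAG_WORDS` are assumptions made for this example, and the distributional and co-occurrence features are not covered.

```python
import re

# Hypothetical flag words (lexical signals of figurative use).
# The actual list used in the experiment is not given in the abstract.
FLAG_WORDS = {"словно", "будто", "буквально", "образно"}

# Quotation marks commonly found in Russian text (an assumed inventory).
QUOTE_CHARS = "\"«»„“”"


def surface_features(sentence: str) -> dict:
    """Return sentence length, a quotation-mark flag, and a flag-word count."""
    # \w+ is Unicode-aware for str patterns in Python 3, so it matches Cyrillic words.
    tokens = re.findall(r"\w+", sentence.lower())
    return {
        "n_tokens": len(tokens),
        "has_quotes": int(any(ch in QUOTE_CHARS for ch in sentence)),
        "n_flag_words": sum(tok in FLAG_WORDS for tok in tokens),
    }


if __name__ == "__main__":
    print(surface_features("Он «взорвал» зал, словно бомба."))
```

A dictionary of this kind could be merged with the other feature groups and passed to any standard classifier; which features are combined is exactly what the models of varying complexity differ in.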

[1]  Xiaojin Zhu,et al.  Hunting Elusive Metaphors Using Lexical Resources. , 2007 .

[2]  Jonathan Dunn,et al.  Evaluating the Premises and Results of Four Metaphor Identification Systems , 2013, CICLing.

[3]  Shlomo Argamon,et al.  Automatic Identification of Conceptual Metaphors With Limited Knowledge , 2013, AAAI.

[4]  Beata Beigman Klebanov,et al.  Metaphor: A Computational Perspective by Tony Veale, Ekaterina Shutova and Beata Beigman Klebanov , 2016, CL.

[5]  John Bryant,et al.  Catching Metaphors , 2006 .

[6]  R. A. Leibler,et al.  On Information and Sufficiency , 1951 .

[7]  Zachary J. Mason CorMet: A Computational, Corpus-Based Conventional Metaphor Extraction System , 2004, CL.

[8]  Michael Mohler,et al.  Semantic Signatures for Example-Based Linguistic Metaphor Detection , 2013 .

[9]  Paula Pérez-Sobrino Gerard J. Steen, Aletta G. Dorst, J. Berenike Herrmann, Anna Kaal, Tina Krennmayr & Trijntje Pasma (2010). A Method for Linguistic Metaphor Identification: From MIP to MIPVU. , 2014 .

[10]  Andrey Kutuzov,et al.  WebVectors: A Toolkit for Building Web Interfaces for Vector Semantic Models , 2016, AIST.

[11]  Adam Kilgarriff,et al.  The Sketch Engine: ten years on , 2014 .

[12]  Philipp Koehn,et al.  Syntax-based Statistical Machine Translation , 2016, Synthesis Lectures on Human Language Technologies.

[13]  Natalia Levshina,et al.  How to do Linguistics with R: Data exploration and statistical analysis , 2015 .

[14]  Jean Maillard,et al.  Black Holes and White Rabbits: Metaphor Identification with Visual Features , 2016, NAACL.

[15]  Beata Beigman Klebanov,et al.  Semantic classifications for detection of verb metaphors , 2016, ACL.

[16]  Tomek Strzalkowski,et al.  Robust Extraction of Metaphor from Novel Data , 2013 .

[17]  Jonathan Dunn What metaphor identification systems can tell us about metaphor-in-language , 2013 .

[18]  Yulia Tsvetkov,et al.  Cross-Lingual Metaphor Detection Using Common Semantic Features , 2013 .

[19]  Ralph Weischedel,et al.  Automatic Extraction of Linguistic Metaphors with LDA Topic Modeling , 2013 .

[20]  Mark Last,et al.  Metaphor Identification in Large Texts Corpora , 2013, PloS one.

[21]  Yorick Wilks,et al.  Making Preferences More Active , 1978, Artif. Intell..

[22]  Gaël Varoquaux,et al.  Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[23]  Yair Neuman,et al.  Literal and Metaphorical Sense Identification through Concrete and Abstract Context , 2011, EMNLP.

[24]  Yulia Tsvetkov,et al.  Metaphor Detection with Cross-Lingual Model Transfer , 2014, ACL.

[25]  A. Goatly The language of metaphors , 1997 .

[26]  Hinrich Schütze,et al.  Book Reviews: Foundations of Statistical Natural Language Processing , 1999, CL.

[27]  Vladimir Zaytsev,et al.  Abductive Inference for Interpretation of Metaphors , 2014 .

[28]  Eduard Hovy,et al.  Identifying Metaphorical Word Use with Tree Kernels , 2013 .

[29]  Gerard J. Steen,et al.  A method for linguistic metaphor identification : from MIP to MIPVU , 2010 .

[30]  Zeno Vendler,et al.  Verbs and Times , 1957, The Language of Time - A Reader.

[31]  Bryan Rink,et al.  A Novel Distributional Approach to Multilingual Conceptual Metaphor Recognition , 2014, COLING.

[32]  Caroline Sporleder,et al.  Using Gaussian Mixture Models to Detect Figurative Language in Context , 2010, NAACL.

[33]  Caroline Sporleder,et al.  Classifier Combination for Contextual Idiom Detection Without Labelled Data , 2009, EMNLP.

[34]  Wim Peters,et al.  Lexicalised Systematic Polysemy in WordNet , 2000, LREC.

[35]  Beata Beigman Klebanov,et al.  Different Texts, Same Metaphors: Unigrams and Beyond , 2014 .

[36]  Ekaterina Kochmar,et al.  ‘Calling on the classical phone’: a distributional model of adjective-noun errors in learners’ English , 2016, COLING.