Deep-learning the Ropes: Modeling Idiomaticity with Neural Networks

In this work we explore the possibility of training a neural network to classify and rank idiomatic expressions under conditions of data scarcity. We discuss our results, comparing them both with other unsupervised models designed for idiom detection and with similar supervised classifiers trained to detect metaphoric bigrams.
