Contextualized Translations of Phrasal Verbs with Distributional Compositional Semantics and Monolingual Corpora

This paper describes a compositional distributional method to generate contextualized senses of words and identify their appropriate translations in the target language using monolingual corpora. Word translation is modeled in the same way as contextualization of word meaning, but in a bilingual vector space. The contextualization of meaning is carried out by means of distributional composition within a structured vector space with syntactic dependencies, while the bilingual space is created by means of transfer rules and a bilingual dictionary. A phrase in the source language, consisting of a head and a dependent, is translated into the target language by selecting both the nearest neighbor of the head given the dependent and the nearest neighbor of the dependent given the head. This process is extended to larger phrases by means of incremental composition. Experiments were performed on English and Spanish monolingual corpora to translate phrasal verbs in context. A new bilingual dataset for evaluating strategies that translate phrasal verbs in restricted syntactic domains has been created and released.
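The head-dependent translation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the sparse vectors, the context labels, the use of component-wise multiplication as the composition operation, and the function names (`contextualize`, `nearest`, `translate_phrase`) are all assumptions made for illustration. The sketch also assumes that the source-language vectors have already been mapped into the target-language context space, which the paper does with transfer rules and a bilingual dictionary.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def contextualize(vec, restriction):
    """Assumed composition: component-wise product, keeping shared contexts only."""
    return {k: w * restriction[k] for k, w in vec.items() if k in restriction}

def nearest(vec, candidates):
    """Return the candidate word whose vector is most similar to vec."""
    return max(candidates, key=lambda word: cosine(vec, candidates[word]))

def translate_phrase(head, dep_on_head, dep, head_on_dep, head_cands, dep_cands):
    """Translate a head-dependent pair: each word is contextualized by the other."""
    head_tr = nearest(contextualize(head, dep_on_head), head_cands)
    dep_tr = nearest(contextualize(dep, head_on_dep), dep_cands)
    return head_tr, dep_tr

# Toy data (hypothetical weights), already expressed in Spanish contexts.
take_off = {"dobj:abrigo": 3.0, "dobj:sombrero": 2.0, "subj:vuelo": 4.0}
coat = {"quitar:dobj": 4.0, "llevar:dobj": 3.0, "pintar:dobj": 2.0}
coat_on_head = {"dobj:abrigo": 5.0, "dobj:sombrero": 3.0}   # restriction from "coat"
take_off_on_dep = {"quitar:dobj": 3.0, "llevar:dobj": 1.0}  # restriction from "take off"
flight_on_head = {"subj:vuelo": 5.0}                        # restriction from "flight"

verb_cands = {
    "quitar": {"dobj:abrigo": 4.0, "dobj:sombrero": 3.0},
    "despegar": {"subj:vuelo": 5.0},
}
noun_cands = {
    "abrigo": {"quitar:dobj": 5.0, "llevar:dobj": 4.0},
    "capa": {"pintar:dobj": 5.0, "quitar:dobj": 1.0},
}

pair = translate_phrase(take_off, coat_on_head, coat, take_off_on_dep,
                        verb_cands, noun_cands)
print(pair)
print(nearest(contextualize(take_off, flight_on_head), verb_cands))
```

With this toy data, "take off" with "coat" as its object selects "quitar" and "abrigo", while "take off" restricted by "flight" selects "despegar". Larger phrases would repeat the same step incrementally, using each contextualized vector as the input to the next composition.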
