论文信息 - Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation

Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation

Word embedding has been found to be highly powerful to translate words from one language to another by a simple linear transform. However, we found some inconsistence among the objective functions of the embedding and the transform learning, as well as the distance measurement. This paper proposes a solution which normalizes the word vectors on a hypersphere and constrains the linear transform as an orthogonal transform. The experimental results confirmed that the proposed solution can offer better performance on a word similarity task and an English-toSpanish word translation task.

[1] Manaal Faruqui,et al. Improving Vector Space Word Representations Using Multilingual Correlation , 2014, EACL.

[2] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[3] Ivan Titov,et al. Inducing Crosslingual Distributed Representations of Words , 2012, COLING.

[4] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[5] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[6] Geoffrey Zweig,et al. Linguistic Regularities in Continuous Space Word Representations , 2013, NAACL.

[7] Phil Blunsom,et al. Learning Bilingual Word Representations by Marginalizing Alignments , 2014, ACL.

[8] Quoc V. Le,et al. Exploiting Similarities among Languages for Machine Translation , 2013, ArXiv.

[9] Christopher D. Manning,et al. Bilingual Word Embeddings for Phrase-Based Machine Translation , 2013, EMNLP.

[10] Phil Blunsom,et al. Multilingual Distributed Representations without Word Alignment , 2013, ICLR 2014.

[11] Marco Baroni,et al. Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space , 2010, EMNLP.

[12] Andrew Y. Ng,et al. Improving Word Representations via Global Context and Multiple Word Prototypes , 2012, ACL.

[13] Yoshua Bengio,et al. Word Representations: A Simple and General Method for Semi-Supervised Learning , 2010, ACL.

[14] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[15] Ehud Rivlin,et al. Placing search in context: the concept revisited , 2002, TOIS.