LIPN-IIMAS at SemEval-2017 Task 1: Subword Embeddings, Attention Recurrent Neural Networks and Cross Word Alignment for Semantic Textual Similarity

In this paper we report our attempt to use, on the one hand, state-of-the-art neural approaches that are proposed to measure Semantic Textual Similarity (STS). On the other hand, we propose an unsupervised cross-word alignment approach, which is linguistically motivated. The neural approaches proposed herein are divided into two main stages. The first stage deals with constructing neural word embeddings, the components of sentence embeddings. The second stage deals with constructing a semantic similarity function relating pairs of sentence embeddings. Unfortunately our competition results were poor in all tracks, therefore we concentrated our research to improve them for Track 5 (EN-EN).

[1]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[2]  Denilson Barbosa,et al.  Open Information Extraction with Tree Kernels , 2013, NAACL.

[3]  Eneko Agirre,et al.  SemEval-2016 Task 1: Semantic Textual Similarity, Monolingual and Cross-Lingual Evaluation , 2016, *SEMEVAL.

[4]  Piotr Andruszkiewicz,et al.  Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity. , 2016, *SEMEVAL.

[5]  Eneko Agirre,et al.  SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity , 2012, *SEMEVAL.

[6]  Tomas Mikolov,et al.  Enriching Word Vectors with Subword Information , 2016, TACL.

[7]  Jonathan Weese,et al.  UMBC_EBIQUITY-CORE: Semantic Textual Similarity Systems , 2013, *SEMEVAL.

[8]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[9]  Yoshua Bengio,et al.  Maxout Networks , 2013, ICML.

[10]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[11]  Geoffrey E. Hinton,et al.  Grammar as a Foreign Language , 2014, NIPS.

[12]  Oren Etzioni,et al.  Identifying Relations for Open Information Extraction , 2011, EMNLP.

[13]  Zellig S. Harris,et al.  Mathematical structures of language , 1968, Interscience tracts in pure and applied mathematics.

[14]  Quoc V. Le,et al.  Distributed Representations of Sentences and Documents , 2014, ICML.