Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question of whether similar methods could be derived to improve embeddings (i.e., semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective to train distributed representations of sentences. Our method outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.

Matteo Pagliardini | Martin Jaggi | Prakhar Gupta
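
The abstract does not spell out the model, but the title points to building sentence representations from compositional n-gram features. The following is a minimal sketch of one such composition, under the assumption (not stated in the text above) that a sentence embedding is the plain average of vectors for the unigrams and bigrams in the sentence. The function names and the toy embedding table are hypothetical, and the training objective itself is not shown.

```python
def extract_ngrams(tokens, max_n=2):
    """Return all contiguous n-grams (as tuples) for n = 1..max_n."""
    return [
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def sentence_embedding(tokens, table, dim, max_n=2):
    """Average the vectors of every n-gram of the sentence found in `table`.

    `table` maps n-gram tuples to lists of floats of length `dim`.
    Assumption: unknown n-grams are skipped; an all-zero vector is
    returned when nothing in the sentence is known.
    """
    feats = [g for g in extract_ngrams(tokens, max_n) if g in table]
    if not feats:
        return [0.0] * dim
    return [sum(table[g][k] for g in feats) / len(feats) for k in range(dim)]


# Toy, hand-written embedding table (hypothetical values, 2 dimensions).
toy_table = {
    ("good",): [1.0, 0.0],
    ("movie",): [0.0, 1.0],
    ("good", "movie"): [1.0, 1.0],
}

vector = sentence_embedding(["good", "movie"], toy_table, dim=2)
print(vector)
```

In this sketch the bigram `("good", "movie")` contributes its own vector alongside the two unigrams, which is what lets the averaged representation carry some local word-order information that a pure bag of words would lose.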
