A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features. This paper introduces à la carte embedding, a simple and general alternative to the usual word2vec-based approaches for building such representations that is based on recent theoretical results for GloVe-like embeddings. Our method relies mainly on a linear transformation that is efficiently learnable from pretrained word vectors via linear regression. The transform can then be applied on the fly whenever a new text feature or rare word is encountered, even if only a single usage example is available. We introduce a new dataset showing that the à la carte method requires fewer examples of words in context to learn high-quality embeddings, and we obtain state-of-the-art results on a nonce task and several unsupervised document classification tasks.
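
Concretely, the method induces a vector for a new feature by averaging the pretrained vectors of the words surrounding its occurrences and then applying the learned linear transform; the transform itself is fit by ordinary least squares so that each vocabulary word's average context vector maps approximately back onto its own pretrained vector. The NumPy sketch below illustrates these steps under stated assumptions: `embeddings` is a dict mapping words to pretrained vectors, `corpus` is a list of token lists, and the function names and `window` parameter are illustrative choices, not taken from the authors' released code.

```python
import numpy as np

def context_averages(corpus, embeddings, window=5):
    # For every word in the corpus, pool the pretrained vectors of all
    # words occurring within `window` tokens of it, then average.
    dim = len(next(iter(embeddings.values())))
    sums, counts = {}, {}
    for sentence in corpus:
        for i, word in enumerate(sentence):
            context = sentence[max(0, i - window):i] + sentence[i + 1:i + 1 + window]
            vecs = [embeddings[c] for c in context if c in embeddings]
            if vecs:
                sums[word] = sums.get(word, np.zeros(dim)) + np.sum(vecs, axis=0)
                counts[word] = counts.get(word, 0) + len(vecs)
    return {w: sums[w] / counts[w] for w in sums}

def learn_transform(embeddings, avg_contexts):
    # Least-squares fit of A so that A @ u_w approximates v_w, where u_w is
    # word w's average context vector and v_w its pretrained vector.
    words = [w for w in avg_contexts if w in embeddings]
    U = np.array([avg_contexts[w] for w in words])
    V = np.array([embeddings[w] for w in words])
    X, *_ = np.linalg.lstsq(U, V, rcond=None)  # solves U @ X ~= V
    return X.T  # so that v_w ~= A @ u_w

def embed_feature(contexts, embeddings, A):
    # Embed a new feature (rare word, n-gram, ...) from its usage contexts:
    # average the pretrained vectors of the surrounding words, then apply A.
    vecs = [embeddings[w] for ctx in contexts for w in ctx if w in embeddings]
    return A @ np.mean(vecs, axis=0)
```

For a nonce word with a single usage example, `contexts` is a one-element list holding that context's tokens; the same transform, fit once on a background corpus, is reused on the fly for every new feature.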
