Task-Oriented Learning of Word Embeddings for Semantic Relation Classification

We present a novel learning method for word embeddings designed for relation classification. Our word embeddings are trained by predicting words between noun pairs using lexical relation-specific features on a large unlabeled corpus. This allows us to explicitly incorporate relation-specific information into the word embeddings. The learned word embeddings are then used to construct feature vectors for a relation classification model. On a well-established semantic relation classification task, our method significantly outperforms a baseline based on a previously introduced word embedding method, and compares favorably to previous state-of-the-art models that use syntactic information or manually constructed external resources.
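
The training signal described above, predicting the words that appear between a pair of nouns, can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `extract_instances`, the `Instance` record, the `max_gap` limit, the feature-string format, and the use of Penn Treebank-style `NN*` tags to detect nouns are all assumptions made for illustration. The sketch only builds (target word, lexical features) training instances; the embedding model that would consume them is omitted.

```python
from typing import List, NamedTuple, Tuple


class Instance(NamedTuple):
    """One hypothetical training instance for an in-between word."""
    noun1: str
    noun2: str
    target: str                 # in-between word to be predicted
    features: Tuple[str, ...]   # lexical features conditioning the prediction


def extract_instances(tokens: List[str], tags: List[str],
                      max_gap: int = 5) -> List[Instance]:
    """Collect in-between-word prediction instances for every noun pair.

    Assumes `tags` holds one POS tag per token and that noun tags start
    with "NN". Pairs with no word between them, or with more than
    `max_gap` words between them, are skipped (an illustrative choice).
    """
    noun_idx = [i for i, tag in enumerate(tags) if tag.startswith("NN")]
    instances = []
    for a, i in enumerate(noun_idx):
        for j in noun_idx[a + 1:]:
            gap = j - i - 1
            if gap < 1 or gap > max_gap:
                continue
            for k in range(i + 1, j):
                # The other in-between words serve as context features.
                others = tuple(f"between={tokens[m]}"
                               for m in range(i + 1, j) if m != k)
                feats = (f"n1={tokens[i]}", f"n2={tokens[j]}") + others
                instances.append(
                    Instance(tokens[i], tokens[j], tokens[k], feats))
    return instances
```

For the tagged sentence "the/DT cat/NN sat/VBD on/IN the/DT mat/NN", the only noun pair is (cat, mat), so the sketch yields one instance per in-between word, each conditioned on the two nouns and the remaining in-between words.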
