Representation Learning for Type-Driven Composition

This paper addresses learning word representations using grammatical type information. We use the syntactic types of Combinatory Categorial Grammar to develop multilinear representations, i.e. maps with n arguments, for words with different functional types. The multilinear maps of words compose with each other to form sentence representations. We extend the skipgram algorithm from vectors to multilinear maps to learn these representations, and instantiate it on unary maps and on binary maps for transitive verbs. These are evaluated on verb and sentence similarity and disambiguation tasks and on a subset of the SICK relatedness dataset. Our model outperforms previous type-driven models and is competitive with state-of-the-art representation learning methods such as BERT and neural sentence encoders.
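To make the composition step concrete, here is a minimal NumPy sketch (not the paper's implementation; dimensions and random vectors are illustrative) of the type-driven idea for a transitive verb: the verb is a binary multilinear map, encoded as an order-3 tensor, applied to its subject and object noun vectors to yield a sentence vector.

```python
import numpy as np

d = 4  # embedding dimension (illustrative choice)
rng = np.random.default_rng(0)

subj = rng.standard_normal(d)           # noun vector, e.g. "dog"
obj = rng.standard_normal(d)            # noun vector, e.g. "ball"
verb = rng.standard_normal((d, d, d))   # transitive verb as a binary (bilinear) map

# Sentence meaning: apply the verb's multilinear map to its two arguments.
# Contract the verb tensor with the subject and object vectors.
sentence = np.einsum('ijk,j,k->i', verb, subj, obj)
assert sentence.shape == (d,)
```

Because the map is multilinear, the output is linear in each argument separately; in a learned model the tensor entries would be trained (e.g. by the extended skipgram objective) rather than drawn at random.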
