Translating and Evolving: Towards a Model of Language Change in DisCoCat

The categorical compositional distributional (DisCoCat) model of meaning, developed by Coecke et al. (2010), has successfully modeled various aspects of meaning. However, it does not account for the fact that language changes. We give an approach to DisCoCat that allows us to represent language models and translations between them, so that we can describe translations from one language to another as well as changes within a single language. We unify the product space representation of Coecke et al. (2010) and the functorial description of Kartsaklis et al. (2013) in a way that allows us to view a language as a catalogue of meanings. We formalize the notion of a lexicon in DisCoCat and define a dictionary of meanings between two lexicons. All of this is done within the framework of monoidal categories. We give examples of how to apply our methods and make a concrete suggestion for compositional translation in corpora.

[1] Mehrnoosh Sadrzadeh, et al. Lambek vs. Lambek: Functorial vector space semantics and string diagrams for Lambek calculus, 2013, Ann. Pure Appl. Log.

[2] Mehrnoosh Sadrzadeh, et al. Experimental Support for a Categorical Compositional Distributional Model of Meaning, 2011, EMNLP.

[3] Dimitri Kartsaklis, et al. A Unified Sentence Space for Categorical Distributional-Compositional Semantics: Theory and Experiments, 2012, COLING.

[4] Stephen Clark, et al. Mathematical Foundations for a Compositional Distributional Model of Meaning, 2010, ArXiv.

[5] Mehrnoosh Sadrzadeh, et al. A generalised quantifier theory of natural language in categorical compositional distributional semantics with bialgebras, 2016, Mathematical Structures in Computer Science.

[6] Martha Lewis, et al. Categorical Compositional Cognition, 2016, QI.

[7] Dimitri Kartsaklis, et al. Sentence entailment in compositional distributional semantics, 2015, Annals of Mathematics and Artificial Intelligence.

[8] Quoc V. Le, et al. Exploiting Similarities among Languages for Machine Translation, 2013, ArXiv.

[9] Stephen Clark, et al. A Type-Driven Tensor-Based Semantics for CCG, 2014, EACL 2014.

[10] Martha Lewis, et al. Interacting Conceptual Spaces I: Grammatical Composition of Concepts, 2017, Conceptual Spaces: Elaborations and Applications.

[11] Dimitri Kartsaklis, et al. Prior Disambiguation of Word Tensors for Constructing Sentence Vectors, 2013, EMNLP.

[12] Curt Burgess, et al. Producing high-dimensional semantic spaces from lexical co-occurrence, 1996.

[13] Remi van Trijp. Linguistic Assessment Criteria for Explaining Language Change: A Case Study on Syncretism in German Definite Articles, 2013.

[14] Martha Lewis, et al. Graded Entailment for Compositional Distributional Semantics, 2016, ArXiv.

[15] Joachim Lambek, et al. Type Grammars as Pregroups, 2001, Grammars.

[16] Dimitri Kartsaklis, et al. Reasoning about Meaning in Natural Language with Compact Closed Categories and Frobenius Algebras, 2014, ArXiv.

[17] A. Kroch. Reflexes of grammar in patterns of language change, 1989, Language Variation and Change.

[18] Tomas Mikolov, et al. Improving Supervised Bilingual Mapping of Word Embeddings, 2018, ArXiv.

[19] Anne Preller, et al. Free compact 2-categories, 2007, Mathematical Structures in Computer Science.

[20] Jianfeng Gao, et al. Learning Continuous Phrase Representations for Translation Modeling, 2014, ACL.

[21] L. Steels. Evolving grounded communication for robots, 2003, Trends in Cognitive Sciences.

[22] Dimitri Kartsaklis, et al. Open System Categorical Quantum Semantics in Natural Language Processing, 2015, CALCO.

[23] Marco Baroni, et al. Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space, 2010, EMNLP.

[24] Anne Preller. From Logical to Distributional Models, 2013, QPL.