A CCG-Based Version of the DisCoCat Framework

While the DisCoCat model (Coecke et al., 2010) has proved a valuable tool for studying compositional aspects of language at the level of semantics, its strong dependency on pregroup grammars imposes two important restrictions: first, it prevents large-scale experimentation, because no pregroup parser is available; and second, it limits the expressive power of the model to that of context-free grammars. In this paper we solve these problems by reformulating DisCoCat as a passage from Combinatory Categorial Grammar (CCG) to a category of semantics. We start by showing that standard categorial grammars can be expressed as a biclosed category, where all rules emerge as currying/uncurrying the identity; we then proceed to model permutation-inducing rules by exploiting the symmetry of the compact closed category encoding the word meaning. As a proof of concept for our method, we convert “Alice in Wonderland” into DisCoCat form and make the resulting corpus available to the community.
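The claim that the rules of a categorial grammar emerge as currying/uncurrying the identity can be illustrated with a small sketch. It is not taken from the paper: it works with ordinary Python functions, which form a symmetric (cartesian) closed setting rather than a genuinely biclosed one, so the left/right distinction of CCG slashes is only mimicked by argument order. All names (`curry`, `uncurry`, `forward_application`, `type_raise`, and the toy word meanings) are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: Python functions stand in for morphisms of a
# closed category. Directionality of slashes is NOT captured here.

def curry(f):
    """Turn a two-argument function f(a, b) into a -> (b -> f(a, b))."""
    return lambda a: lambda b: f(a, b)

def uncurry(g):
    """Turn a curried function a -> (b -> c) into a two-argument function."""
    return lambda a, b: g(a)(b)

def identity(x):
    return x

# Forward application (X/Y  Y => X): uncurrying the identity on a function
# gives the evaluation map (f, x) -> f(x).
forward_application = uncurry(identity)

# Backward application (Y  X\Y => X): the same evaluation with the
# argument arriving on the left.
def backward_application(x, f):
    return forward_application(f, x)

# Type raising (X => T/(T\X)): currying backward application.
type_raise = curry(backward_application)

# Forward composition (X/Y  Y/Z => X/Z), written with evaluation only.
def forward_composition(f, g):
    return lambda z: forward_application(f, forward_application(g, z))

# Toy word meanings (assumptions for the example, not real semantics).
alice = "Alice"
sleeps = lambda subj: f"sleeps({subj})"       # plays the role of S\NP
really = lambda sentence: f"really({sentence})"  # plays the role of S/S

print(backward_application(alice, sleeps))
print(type_raise(alice)(sleeps))
print(forward_composition(really, sleeps)(alice))
```

In this toy setting, applying the verb directly and applying the type-raised subject to the verb give the same result, which is the behaviour one expects from type raising.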

[1]  Martha Lewis. Modelling hyponymy for DisCoCat, 2019.

[2]  Stanley Peters, et al. Cross-Serial Dependencies in Dutch, 1982.

[3]  Jason Baldridge, et al. Lexically specified derivational control in combinatory categorial grammar, 2002.

[4]  Samson Abramsky, et al. A categorical semantics of quantum protocols, 2004, Proceedings of the 19th Annual IEEE Symposium on Logic in Computer Science, 2004.

[5]  Dimitri Kartsaklis, et al. Open System Categorical Quantum Semantics in Natural Language Processing, 2015, CALCO.

[6]  Daniel J. Dougherty. Closed Categories and Categorial Grammar, 1992, Notre Dame J. Formal Log.

[7]  Martin Kay, et al. Syntactic Process, 1979, ACL.

[8]  J. Lambek, et al. Categorial and Categorical Grammars, 1988.

[9]  Bob Coecke, et al. DisCoPy: Monoidal Categories in Python, 2021, Electronic Proceedings in Theoretical Computer Science.

[10]  Mehrnoosh Sadrzadeh, et al. Experimental Support for a Categorical Compositional Distributional Model of Meaning, 2011, EMNLP.

[11]  Dimitri Kartsaklis, et al. A Unified Sentence Space for Categorical Distributional-Compositional Semantics: Theory and Experiments, 2012, COLING.

[12]  Mark Steedman, et al. Combinatory grammars and parasitic gaps, 1987.

[13]  Stephen Clark, et al. Mathematical Foundations for a Compositional Distributional Model of Meaning, 2010, ArXiv.

[14]  Stephen Clark, et al. A Type-Driven Tensor-Based Semantics for CCG, 2014, EACL 2014.

[15]  David J. Weir, et al. The equivalence of four extensions of context-free grammars, 1994, Mathematical systems theory.

[16]  Dimitri Kartsaklis, et al. QNLP in Practice: Running Compositional Models of Meaning on a Quantum Computer, 2021, ArXiv.

[17]  Y. Bar-Hillel. A Quasi-Arithmetical Notation for Syntactic Description, 1953.

[18]  Wojciech Buszkowski, et al. Lambek Grammars Based on Pregroups, 2001, LACL.

[19]  Giorgio Satta, et al. Lexicalization and Generative Power in CCG, 2015, CL.

[20]  Mehrnoosh Sadrzadeh, et al. Lambek vs. Lambek: Functorial vector space semantics and string diagrams for Lambek calculus, 2013, Ann. Pure Appl. Log.

[21]  Martha Lewis, et al. Graded Entailment for Compositional Distributional Semantics, 2016, ArXiv.

[22]  Edward Grefenstette, et al. Category-theoretic quantitative compositional distributional models of natural language semantics, 2013, ArXiv.

[23]  Dimitri Kartsaklis, et al. Reasoning about Meaning in Natural Language with Compact Closed Categories and Frobenius Algebras, 2014, ArXiv.

[24]  Yuji Matsumoto, et al. A* CCG Parsing with a Supertag and Dependency Factored Model, 2017, ACL.

[25]  Bob Coecke, et al. Grammar-Aware Question-Answering on Quantum Computers, 2020, ArXiv.