The framework of Compositional Distributional Semantics unifies vector space models for lexical meanings with a compositional account of how these meanings combine into phrases and larger units. The syntactic engines that have so far driven the interpretation process (Lambek grammars, pregroups) are problematic in two respects, overgeneration and undergeneration, compromising the accuracy of the quantitative values associated with a derivation. We address these problems by moving to a non-symmetric, non-associative, non-unital type logic with a tree-building tensor operation, generating phrases rather than strings. Composition (tensor) and decomposition of phrases (cotensor) are treated on a par. Reordering and restructuring are controlled by adjoint pairs of modalities, the grammatical analogues of Linear Logic's '!'. We discuss the categorical structures for this model of syntax and the associated graphical language, and identify some empirical areas where the model leads to improved performance.

In the field of natural language semantics, the compositional distributional framework of [3] and subsequent work (see [9] for an overview of results obtained so far) has achieved remarkable progress by unifying vector space models for lexical meanings with a compositional account of how these meanings combine into phrases and larger units. Interpretation takes the form of a functorial transition from Form to Meaning: a structure-preserving map that associates the operations for building syntactic structure with vector composition operations, thus assigning quantitative values to these structures. The quality of the values so obtained is determined by the accuracy of the syntactic engine driving the interpretation process. Compositional Distributional Semantics has used type logics for that purpose: Lambek's original Syntactic Calculus (L) and its more recent pregroup incarnation (PG). Categorically, these are systems with a (non-symmetric) monoidal bi-closed or compact closed structure, respectively. As models of natural language syntax, these calculi are lacking in two respects: overgeneration and undergeneration. Both L and PG model the composition of phrases with an associative multiplicative tensor operation, claiming in fact that no aspect of grammatical organization beyond linear order can affect grammaticality.
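To fix intuitions about this functorial transition, the following is a minimal worked example in the standard pregroup style; the spaces N and S, the lexical vectors, and the example sentence are illustrative choices rather than material taken from the present paper. A transitive sentence such as "Alice likes Bob" receives the typing
\[
  \text{Alice} : n, \qquad \text{likes} : n^r\, s\, n^l, \qquad \text{Bob} : n,
\]
and the pregroup reduction \( n\,(n^r\, s\, n^l)\, n \to s \) is witnessed by the contractions \( n\, n^r \to 1 \) and \( n^l\, n \to 1 \). A meaning functor sends the basic types to vector spaces, \( n \mapsto N \), \( s \mapsto S \), and sends each contraction to the evaluation (inner product) map \( \varepsilon_N : N \otimes N \to \mathbb{R} \). The sentence vector is then
\[
  \overrightarrow{\text{Alice likes Bob}}
  \;=\;
  (\varepsilon_N \otimes 1_S \otimes \varepsilon_N)
  \bigl(\overrightarrow{\text{Alice}} \otimes \overline{\text{likes}} \otimes \overrightarrow{\text{Bob}}\bigr),
  \qquad
  \overline{\text{likes}} \in N \otimes S \otimes N,
\]
so the quantitative value attached to the derivation is obtained by contracting the verb tensor with its subject and object vectors. The accuracy of this value therefore depends directly on which reductions the syntactic engine licenses.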
[1] Michael Moortgat et al. Continuation semantics for the Lambek-Grishin calculus. Inf. Comput., 2010.
[2] Michael Moortgat et al. Symmetric Categorial Grammar. J. Philos. Log., 2009.
[3] Michael Moortgat et al. Structural control. 1997.
[4] Dimitri Kartsaklis et al. Open System Categorical Quantum Semantics in Natural Language Processing. CALCO, 2015.
[5] G. Wijnholds. Categorical Foundations for Extended Compositional Distributional Models of Meaning. 2014.
[6] Joachim Lambek et al. On the Calculus of Syntactic Types. 1961.
[7] Bart Jacobs et al. Semantics of Weakening and Contraction. Ann. Pure Appl. Log., 1994.
[8] Stephen Clark et al. Mathematical Foundations for a Compositional Distributional Model of Meaning. ArXiv, 2010.
[9] Laura Kallmeyer et al. Parsing Beyond Context-Free Grammars. Cognitive Technologies, 2010.