Reasoning about Meaning in Natural Language with Compact Closed Categories and Frobenius Algebras

Compact closed categories were applied by Abramsky and Coecke to model quantum information protocols. They also provide semantics for Lambek's pregroup algebras, which formalize the grammatical structure of natural language, and are implicit in distributional models of word meaning based on vector spaces. Specifically, in previous work Coecke, Clark, and Sadrzadeh used the product category of pregroups with vector spaces to provide a distributional model of meaning for sentences. We recast this theory in terms of strongly monoidal functors and advance it via Frobenius algebras over vector spaces. Strongly monoidal functors were used by Atiyah and by Baez and Dolan to formalize topological quantum field theories, and Frobenius algebras were used by Coecke, Pavlovic, and Vicary to model classical data in quantum protocols. The Frobenius algebras let us work in a single space in which the meanings of words, phrases, and sentences of any structure live. Hence we can compare the meanings of different language constructs and extend the applicability of the theory. We report experimental results on a number of language tasks and verify the theoretical predictions.
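To make the role of Frobenius algebras over vector spaces more concrete, the following is a minimal sketch, not the implementation used in this work. It assumes a finite-dimensional real vector space with a fixed basis, where the copy map sends each basis vector e_i to e_i ⊗ e_i and the merge map sends e_i ⊗ e_j to e_i when i = j and to 0 otherwise. The function names `copy_map` and `merge_map` and the toy vectors are hypothetical, chosen only for illustration.

```python
import numpy as np


def copy_map(v):
    """Copy map V -> V (x) V: sends basis vector e_i to e_i (x) e_i.

    A vector becomes the diagonal matrix carrying its coordinates.
    """
    return np.diag(v)


def merge_map(m):
    """Merge map V (x) V -> V: sends e_i (x) e_j to e_i if i == j, else 0.

    A matrix is projected onto the vector of its diagonal entries.
    """
    return np.diagonal(m).copy()


# Hypothetical toy vectors in a 3-dimensional space (illustration only).
noun = np.array([1.0, 2.0, 0.5])
other = np.array([0.0, 3.0, 4.0])

# Merging after copying returns the original vector.
roundtrip = merge_map(copy_map(noun))

# Merging a product state v (x) w yields the elementwise product of v and w.
merged = merge_map(np.outer(noun, other))

# Frobenius law on v (x) w: (merge (x) id) after (id (x) copy) equals copy after merge.
tensor = np.einsum("i,jk->ijk", noun, copy_map(other))  # v (x) copy(w)
lhs = np.einsum("iik->ik", tensor)                      # merge the first two legs
rhs = copy_map(merge_map(np.outer(noun, other)))        # copy(merge(v (x) w))
frobenius_holds = np.allclose(lhs, rhs)
```

Under these assumptions, copying lifts a vector into the tensor space and merging brings a tensor back down to a vector in the same space, which is one way to see how constructs of different grammatical types could end up in a common space and be compared there.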
