Graded Entailment for Compositional Distributional Semantics

The categorical compositional distributional model of natural language provides a conceptually motivated procedure for computing the meaning of a sentence, given its grammatical structure and the meanings of its words. This approach has outperformed other models in mainstream empirical language processing tasks. However, until recently it has lacked the crucial feature of lexical entailment, as have other distributional models of meaning. In this paper we solve the problem of entailment for categorical compositional distributional semantics. Taking advantage of the abstract categorical framework allows us to vary our choice of model. This enables the introduction of a notion of entailment, exploiting ideas from the categorical semantics of partial knowledge in quantum computation. The new model of language uses density matrices, on which we introduce a novel, robust graded order capturing the entailment strength between concepts. This graded measure emerges from a general framework for approximate entailment, induced by any commutative monoid. Quantum logic embeds in our graded order. Our main theorem shows that entailment strength lifts compositionally to the sentence level, giving a lower bound on sentence entailment. We describe the essential properties of graded entailment, such as continuity, and provide a procedure for calculating entailment strength.
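To make the idea of a graded order on density matrices concrete, the following is a minimal sketch and not the procedure from the paper itself. It assumes that the entailment strength from A to B is the largest k in [0, 1] for which B - kA is positive semidefinite; the abstract does not spell out this definition. The function name `entailment_strength`, the tolerance `tol`, and the toy "dog"/"pet" matrices are hypothetical choices made for illustration only.

```python
import numpy as np

def entailment_strength(a, b, tol=1e-10):
    """Largest k in [0, 1] such that b - k*a is positive semidefinite.

    Assumed definition for illustration; a and b are density matrices
    (Hermitian, positive semidefinite, trace one).
    """
    w, v = np.linalg.eigh(b)          # spectral decomposition of b
    keep = w > tol                    # eigenvalues spanning the support of b

    # If a has any weight outside the support of b, no k > 0 can work.
    null = v[:, ~keep]
    if null.size and np.linalg.norm(null.conj().T @ a @ null) > tol:
        return 0.0

    # On the support of b: b - k*a >= 0  iff  I - k * b^(-1/2) a b^(-1/2) >= 0,
    # so the largest admissible k is 1 / (largest eigenvalue of that matrix).
    vs = v[:, keep]
    inv_sqrt = vs @ np.diag(w[keep] ** -0.5) @ vs.conj().T
    m = inv_sqrt @ a @ inv_sqrt
    lam = np.linalg.eigvalsh((m + m.conj().T) / 2).max()
    return float(min(1.0, 1.0 / lam)) if lam > tol else 1.0

# Toy example (hypothetical data): "dog" as a pure state, "pet" as an even
# mixture of two basis states standing in for "dog" and "cat".
dog = np.array([[1.0, 0.0], [0.0, 0.0]])
pet = np.array([[0.5, 0.0], [0.0, 0.5]])

print(entailment_strength(dog, pet))  # strength of "dog entails pet"
print(entailment_strength(pet, dog))  # strength of "pet entails dog"
```

Under the assumed definition, the order is asymmetric: the more specific concept entails the more general one with positive strength, while the reverse direction has strength zero because the general concept has weight outside the support of the specific one.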
