Functional Distributional Semantics

Vector space models have become popular in distributional semantics, despite the challenges they face in capturing various semantic phenomena. We propose a novel probabilistic framework which draws on both formal semantics and recent advances in machine learning. In particular, we separate predicates from the entities they refer to, allowing us to perform Bayesian inference based on logical forms. We describe an implementation of this framework using a combination of Restricted Boltzmann Machines and feedforward neural networks. Finally, we demonstrate the feasibility of this approach by training it on a parsed corpus and evaluating it on established similarity datasets.
