Probing for semantic evidence of composition by means of simple classification tasks

We propose a diagnostic method for probing specific information captured in vector representations of sentence meaning, via simple classification tasks with strategically constructed sentence sets. We identify some key types of semantic information that we might expect to be captured in sentence composition, and illustrate example classification tasks for targeting this information.
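As a rough illustration of the kind of classification task described above, the following minimal sketch probes two toy sentence representations for one piece of semantic information: whether "professor" is the agent of the verb. Everything in it is an assumption made for illustration and is not taken from the paper: the sentence template, the random word vectors in `EMB`, the two composition functions, and the perceptron probe. For brevity, accuracy is measured on the same small sentence set used for training.

```python
import random

random.seed(0)
DIM = 8
NOUNS = ["professor", "student", "lawyer", "doctor"]
VOCAB = NOUNS + ["the", "helped"]
# Hypothetical random word vectors standing in for real embeddings.
EMB = {w: [random.gauss(0, 1) for _ in range(DIM)] for w in VOCAB}


def make_dataset():
    """Sentences 'the A helped the B'; label 1 iff 'professor' is the agent A."""
    data = []
    for other in NOUNS[1:]:
        data.append((["the", "professor", "helped", "the", other], 1))
        data.append((["the", other, "helped", "the", "professor"], 0))
    return data


def bow_average(tokens):
    """Order-insensitive representation: mean of the word vectors.
    Tokens are sorted first so the sum is exactly order-invariant."""
    toks = sorted(tokens)
    return [sum(EMB[t][i] for t in toks) / len(toks) for i in range(DIM)]


def concat_positions(tokens):
    """Order-sensitive representation: word vectors concatenated by position."""
    return [x for t in tokens for x in EMB[t]]


def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_perceptron(X, y, epochs=1000):
    """Simple linear probe; stops early once every example is classified correctly."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(X, y):
            if predict(w, b, x) != t:
                sign = 1 if t == 1 else -1
                w = [wi + sign * xi for wi, xi in zip(w, x)]
                b += sign
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def probe_accuracy(represent):
    """Train the probe on one representation and report its accuracy."""
    data = make_dataset()
    X = [represent(tokens) for tokens, _ in data]
    y = [label for _, label in data]
    w, b = train_perceptron(X, y)
    return sum(predict(w, b, x) == t for x, t in zip(X, y)) / len(y)


accuracy_bow = probe_accuracy(bow_average)
accuracy_concat = probe_accuracy(concat_positions)
print("averaged vectors:", accuracy_bow)
print("position-concatenated vectors:", accuracy_concat)
```

Because each positive sentence has a negative counterpart containing exactly the same words, the averaged representation gives both members of a pair the same vector, so no probe can label more than half of the sentences correctly. The position-aware representation retains who did what to whom, and the probe should be able to exploit that. Controlling the sentence set in this way is what lets probe accuracy be read as evidence about the representation rather than about surface word cues.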
