Paper information - Evaluating Compositionality of Sentence Representation Models

Evaluating Compositionality of Sentence Representation Models

We evaluate the compositionality of general-purpose sentence encoders using two metrics that quantify their capacity for compositional understanding. We introduce a novel metric, Polarity Sensitivity Scoring (PSS), which uses sentiment perturbations as a proxy for measuring compositionality. We then compare the PSS results with those obtained from our proposed extension of Tree Reconstruction Error (TRE) (CITATION), a metric that evaluates compositionality by measuring how well a model that produces the true representations can be approximated by a model that explicitly combines the representations of its primitives.
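The idea behind TRE can be illustrated with a minimal sketch. It assumes additive composition over whitespace-separated tokens and Euclidean distance, which gives a closed-form least-squares fit. These choices, the function name `tree_reconstruction_error`, and the toy data are assumptions made for illustration; they are not the extension proposed in the paper.

```python
import numpy as np

def tree_reconstruction_error(sentences, reps):
    """Additive TRE sketch (hypothetical helper, not the paper's method).

    Fits one vector per token so that the sum of a sentence's token
    vectors approximates the encoder's representation of that sentence,
    then returns the mean Euclidean reconstruction error.
    """
    vocab = sorted({tok for s in sentences for tok in s.split()})
    index = {tok: i for i, tok in enumerate(vocab)}

    # counts[i, j] = number of times token j occurs in sentence i.
    counts = np.zeros((len(sentences), len(vocab)))
    for i, s in enumerate(sentences):
        for tok in s.split():
            counts[i, index[tok]] += 1.0

    # Least-squares fit of the primitive (token) vectors.
    primitives, *_ = np.linalg.lstsq(counts, reps, rcond=None)
    composed = counts @ primitives

    # Distance between the true and the explicitly composed representations.
    return float(np.mean(np.linalg.norm(reps - composed, axis=1)))

# Toy data (assumed): a perfectly additive "encoder" versus random vectors.
sentences = ["good movie", "bad movie", "good plot", "bad plot"]
rng = np.random.default_rng(0)
word_vecs = {w: rng.normal(size=4) for w in ["good", "bad", "movie", "plot"]}
additive_reps = np.stack(
    [sum(word_vecs[t] for t in s.split()) for s in sentences]
)
random_reps = rng.normal(size=(4, 4))

err_additive = tree_reconstruction_error(sentences, additive_reps)
err_random = tree_reconstruction_error(sentences, random_reps)
```

Under these assumptions, representations that really are sums of token vectors reconstruct almost exactly, while unstructured representations leave a residual error; a lower score indicates that the encoder is better approximated by explicit composition.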

[1] Nan Hua, et al. Universal Sentence Encoder for English, 2018, EMNLP.

[2] Yi Liu, et al. Teaching Compositionality to CNNs, 2017, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3] Allyson Ettinger, et al. Probing for semantic evidence of composition by means of simple classification tasks, 2016, RepEval@ACL.

[4] Marco Baroni, et al. Linguistic generalization and compositionality in modern artificial neural networks, 2019, Philosophical Transactions of the Royal Society B.

[5] Christopher Potts, et al. A large annotated corpus for learning natural language inference, 2015, EMNLP.

[6] Holger Schwenk, et al. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data, 2017, EMNLP.

[7] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.

[8] Mathijs Mul, et al. Compositionality Decomposed: How do Neural Networks Generalise?, 2019, J. Artif. Intell. Res.

[9] Marco Baroni, et al. Rearranging the Familiar: Testing Compositional Generalization in Recurrent Networks, 2018, BlackboxNLP@EMNLP.

[10] Marco Baroni, et al. Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks, 2017, ICML.

[11] Xiao Wang, et al. Measuring Compositional Generalization: A Comprehensive Method on Realistic Data, 2019, ICLR.

[12] Percy Liang, et al. Delete, Retrieve, Generate: a Simple Approach to Sentiment and Style Transfer, 2018, NAACL.

[13] Marco Baroni, et al. CNNs found to jump around more skillfully than RNNs: Compositional Generalization in Seq2seq Convolutional Networks, 2019, ACL.

[14] Jacob Andreas, et al. Measuring Compositionality in Representation Learning, 2019, ICLR.

[15] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[16] Ido Dagan, et al. Still a Pain in the Neck: Evaluating Text Representations on Lexical Composition, 2019, TACL.

[17] Tom M. Mitchell, et al. A Compositional and Interpretable Semantic Space, 2015, NAACL.

[18] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.