Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

Semantic word spaces have been very useful but cannot express the meaning of longer phrases in a principled way. Further progress towards understanding compositionality in tasks such as sentiment detection requires richer supervised training and evaluation resources and more powerful models of composition. To remedy this, we introduce a Sentiment Treebank. It includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality. To address them, we introduce the Recursive Neural Tensor Network. When trained on the new treebank, this model outperforms all previous methods on several metrics. It pushes the state of the art in single-sentence positive/negative classification from 80% up to 85.4%. The accuracy of predicting fine-grained sentiment labels for all phrases reaches 80.7%, an improvement of 9.7% over bag-of-features baselines. Lastly, it is the only model that can accurately capture the effects of negation and its scope at various tree levels for both positive and negative phrases.
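The abstract names the Recursive Neural Tensor Network but does not spell out its composition step. As a rough illustration only, the sketch below assumes the commonly described form, in which a parent vector is computed from two child vectors as p = tanh(x^T V x + W x), with x the concatenation of the children. The function names, the toy dimension d = 3, the example tree, and the random (untrained) parameters are all assumptions made for this sketch, not details taken from the abstract.

```python
import math
import random


def rntn_compose(a, b, V, W):
    """Compose two d-dimensional child vectors into one d-dimensional parent.

    Assumed form: p_k = tanh(x^T V[k] x + W[k] . x), where x = [a; b].
    V holds d slices of shape (2d x 2d); W has shape (d x 2d).
    """
    x = a + b  # list concatenation gives the 2d-dimensional input
    n = len(x)
    parent = []
    for k in range(len(a)):
        bilinear = sum(x[i] * V[k][i][j] * x[j] for i in range(n) for j in range(n))
        linear = sum(W[k][j] * x[j] for j in range(n))
        parent.append(math.tanh(bilinear + linear))
    return parent


def compose_tree(tree, embeddings, V, W):
    """Recursively compose a binary parse tree.

    A tree is either a word (str) or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return embeddings[tree]
    left, right = tree
    return rntn_compose(
        compose_tree(left, embeddings, V, W),
        compose_tree(right, embeddings, V, W),
        V,
        W,
    )


# Toy setup with random, untrained parameters (illustrative values only).
random.seed(0)
d = 3


def small():
    return random.uniform(-0.1, 0.1)


embeddings = {w: [small() for _ in range(d)] for w in ["not", "very", "good"]}
V = [[[small() for _ in range(2 * d)] for _ in range(2 * d)] for _ in range(d)]
W = [[small() for _ in range(2 * d)] for _ in range(d)]

# Binary parse of "not very good": (not (very good))
root = compose_tree(("not", ("very", "good")), embeddings, V, W)
```

Every node of the tree, including `root`, receives a vector of the same dimension as the word vectors, which is what lets a sentiment classifier be applied at every phrase rather than only at the sentence level. Setting every entry of `V` to zero reduces the step to a plain recursive neural network layer, tanh(W x).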
