CSNN: Contextual Sentiment Neural Network

Although deep neural networks perform well in text sentiment analysis, their use in real-world practice is sometimes limited by their black-box nature. In response, we propose a novel neural network model, the contextual sentiment neural network (CSNN), that can explain the process of its sentiment analysis prediction in a way that humans find natural and agreeable. The CSNN has the following interpretable layers: the word-level original sentiment layer, word-level sentiment shift layer, word-level local contextual sentiment layer, word-level global importance layer, and word-level global contextual sentiment layer. With these layers, the network can explain the process behind its document-level sentiment analysis results in a human-like way. Realizing the interpretability of each layer is a crucial problem in developing the CSNN because the general back-propagation method cannot realize such interpretability. To address this problem, we propose a novel learning strategy called initialization propagation (IP) learning. Using real textual datasets, we experimentally demonstrate that the proposed IP learning is effective for improving the interpretability of each layer in the CSNN. We then experimentally demonstrate that the CSNN achieves both high predictive performance and high explanation ability.
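
As an illustration only, the following sketch shows one way the five word-level quantities named above could combine into a document-level score. The abstract does not state how the layers are connected, so everything here is an assumption: the function name `explain_document`, the element-wise product of original sentiment and shift, the softmax form of the importance weights, and the summation into a document score are all hypothetical and are not taken from the CSNN itself. The per-word inputs are hand-set numbers standing in for values a trained network would produce.

```python
import math


def explain_document(tokens, original, shift, importance_logits):
    """Toy composition of per-word scores; all inputs are hypothetical."""
    # Assumed local contextual sentiment: the original polarity of each word,
    # kept (shift near +1) or reversed (shift near -1) by its shift score.
    local = [o * s for o, s in zip(original, shift)]

    # Assumed global importance: a softmax over per-word logits.
    top = max(importance_logits)
    exps = [math.exp(x - top) for x in importance_logits]
    total = sum(exps)
    importance = [e / total for e in exps]

    # Assumed global contextual sentiment: local sentiment weighted by importance.
    global_sentiment = [l * a for l, a in zip(local, importance)]

    # Assumed document-level score: the sum of the word-level contributions.
    document_score = sum(global_sentiment)
    return {
        "tokens": list(tokens),
        "local": local,
        "importance": importance,
        "global": global_sentiment,
        "document_score": document_score,
        "label": "positive" if document_score >= 0 else "negative",
    }


if __name__ == "__main__":
    # "good" is positive in isolation, but the assumed shift score of -1
    # (caused by the preceding "not") reverses its contribution.
    result = explain_document(
        tokens=["not", "good"],
        original=[0.0, 0.8],
        shift=[1.0, -1.0],
        importance_logits=[0.0, 0.0],
    )
    for word, contribution in zip(result["tokens"], result["global"]):
        print(word, contribution)
    print(result["label"])
```

Under these assumptions, each intermediate list can be read word by word, which is the kind of layer-by-layer explanation the abstract describes; how the actual model computes and trains these layers is not specified here.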
