Social emotion classification based on noise-aware training

Abstract

Social emotion classification has drawn considerable attention from natural language processing researchers in recent years, since analyzing user-generated emotional documents on the Web is useful for recommending products, gathering public opinions, and predicting election results. However, documents that evoke prominent social emotions are usually mixed with noisy instances, and capturing the textual meaning of short messages is also challenging. In this work, we focus on reducing the impact of noisy instances and on learning a better representation of sentences. For the former, we introduce an “emotional concentration” indicator, derived from emotional ratings, which is used to weight documents. For the latter, we propose a new architecture named PCNN, which uses two cascading convolutional layers to model the word-phrase relation and the phrase-sentence relation. This model treats contiguous tokens as phrases, on the assumption that neighboring words are very likely to be internally related, and generates semantic feature vectors from the phrase representation. We also present a Bayesian model named WMCM to learn document-level semantic features. Both PCNN and WMCM classify social emotions by capturing semantic regularities in language. Experiments on two real-world datasets indicate that our models improve both the quality of the learned semantic vectors and the performance of social emotion classification.
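The cascading idea behind PCNN can be illustrated with a minimal sketch: a first convolution slides over word vectors to form phrase vectors, and a second convolution slides over those phrase vectors to form sentence-level features. The abstract does not specify filter widths, dimensions, the activation function, or the pooling step, so the width of 3, the tanh activation, the max-over-time pooling, and all names below (`conv1d`, `cascaded_sentence_vector`, `random_kernel`) are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def conv1d(seq, kernel, bias):
    """Valid 1-D convolution over a sequence of vectors, followed by tanh.

    seq:    list of seq_len vectors, each of length in_dim
    kernel: kernel[w][i][o] weights, shape (width, in_dim, out_dim)
    bias:   list of out_dim biases
    """
    width, out_dim = len(kernel), len(bias)
    out = []
    for start in range(len(seq) - width + 1):
        row = []
        for o in range(out_dim):
            total = bias[o]
            for w in range(width):
                for i, x in enumerate(seq[start + w]):
                    total += x * kernel[w][i][o]
            row.append(math.tanh(total))  # tanh is an assumed activation
        out.append(row)
    return out


def cascaded_sentence_vector(word_vecs, k1, b1, k2, b2):
    """Two cascading convolutions: words -> phrases -> sentence features."""
    phrases = conv1d(word_vecs, k1, b1)   # each row covers a window of words
    features = conv1d(phrases, k2, b2)    # each row covers a window of phrases
    # Max-over-time pooling (an assumption) gives a fixed-size vector.
    return [max(column) for column in zip(*features)]


def random_kernel(rng, width, in_dim, out_dim, scale=0.1):
    """Hypothetical helper: small random weights, for demonstration only."""
    return [[[rng.gauss(0.0, scale) for _ in range(out_dim)]
             for _ in range(in_dim)]
            for _ in range(width)]


rng = random.Random(0)
# Ten tokens with 8-dimensional embeddings (random stand-ins for real vectors).
words = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(10)]
k1, b1 = random_kernel(rng, 3, 8, 16), [0.0] * 16
k2, b2 = random_kernel(rng, 3, 16, 32), [0.0] * 32

phrases = conv1d(words, k1, b1)
sentence_vec = cascaded_sentence_vector(words, k1, b1, k2, b2)
print(len(phrases), len(sentence_vec))
```

With ten input tokens and a width-3 filter, the first layer yields eight phrase vectors; the second layer and the pooling step reduce these to a single 32-dimensional sentence vector that a classifier could consume.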
