Identifying Structural Holes for Sentiment Classification

The prevalence of online user-generated content has attracted great interest in textual sentiment analysis, which provides a low-cost yet effective way to understand consumers and markets. A mainstream approach to sentiment analysis is to build a classification model on Bag-of-Words (BoW) features, but the large vocabulary and the skewed distribution of term frequencies pose persistent challenges, which are made worse by the scarcity of valid sentiment labels. In this paper, we therefore propose a novel method for BoW-based sentiment classification called the Structural Holes based Sentiment Classifier (SHSC). The key to SHSC is to reinforce the classification contribution of semantically rich words with clear-cut sentiment polarity. To this end, a word co-occurrence network is constructed to represent both high- and low-frequency words. The task of finding classification-inefficient words is then transformed into the identification of so-called bridge nodes, which occupy the positions of structural holes in the network. Two measures, information advantage rank and control advantage weight, are designed for this purpose; they are based on the proposed sentiment-label propagation and short-path computation algorithms, respectively. Finally, SHSC feeds this information as the key regularizers into a simple regression model to guide parametric learning. Extensive experiments on real-world text datasets demonstrate the advantage of the SHSC model over competitive benchmarks, particularly when sentiment labels are scarce. The effectiveness of uncovering structural holes for sentiment classification is also verified with robustness checks and demonstration cases.
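To make the idea of bridge nodes in a word co-occurrence network concrete, the following minimal Python sketch builds such a network from a toy corpus and scores each word with Burt's classic network constraint, where a low value indicates a node spanning a structural hole. This is an illustration under stated assumptions, not the SHSC method: the abstract does not define information advantage rank or control advantage weight, so Burt's constraint is swapped in as a standard stand-in, and the function names (`cooccurrence_network`, `tie_share`, `burt_constraint`) and the toy documents are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_network(docs):
    """Undirected weighted graph; an edge weight counts the documents
    in which both words appear (an assumed co-occurrence definition)."""
    graph = defaultdict(dict)
    for tokens in docs:
        for u, v in combinations(sorted(set(tokens)), 2):
            graph[u][v] = graph[u].get(v, 0) + 1
            graph[v][u] = graph[v].get(u, 0) + 1
    return dict(graph)


def tie_share(graph, a, b):
    """Proportion of a's total tie weight that is invested in b."""
    total = sum(graph[a].values())
    return graph[a].get(b, 0) / total if total else 0.0


def burt_constraint(graph, node):
    """Burt's constraint: sum over neighbours j of
    (direct share to j + indirect share to j via other neighbours)^2.
    Used here as a stand-in for the paper's own measures."""
    score = 0.0
    for j in graph[node]:
        indirect = sum(
            tie_share(graph, node, q) * tie_share(graph, q, j)
            for q in graph[node]
            if q != j
        )
        score += (tie_share(graph, node, j) + indirect) ** 2
    return score


# Toy corpus (hypothetical): two polarity clusters linked only by "phone".
docs = [
    ["great", "love", "excellent"],
    ["awful", "hate", "terrible"],
    ["phone", "great"],
    ["phone", "awful"],
]

graph = cooccurrence_network(docs)
scores = {word: burt_constraint(graph, word) for word in graph}

# The word with the lowest constraint is the strongest bridge candidate.
bridge = min(scores, key=scores.get)
```

In this toy corpus the lowest-constraint word is the polarity-neutral "phone", which connects the positive and negative clusters. That matches the abstract's intuition that bridge nodes tend to be classification-inefficient words, although how SHSC turns such scores into regularizers is not specified here.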
