Exploring and inferring user–user pseudo‐friendship for sentiment analysis with heterogeneous networks

With the development of social media and social networks, user-generated content such as forums, blogs, and comments is not only getting richer but also becoming ubiquitously interconnected with many other objects and entities, forming a heterogeneous information network. Sentiment analysis on such data can no longer ignore this network, since it carries rich and valuable information, explicit or implicit, some of which can be observed and some of which cannot. However, most existing methods rely heavily on observed user-user friendship or on similarity between objects, and can only handle a subgraph associated with a single topic. None of them takes into account hidden and implicit dissimilarity, opposing opinions, or foe relationships. In this paper, we propose a novel information network-based framework that infers hidden similarity and dissimilarity between users by exploring similar and opposite opinions, so as to improve post-level and user-level sentiment classification at the same time. More specifically, we develop a new meta path-based measure for inferring pseudo-friendship as well as dissimilarity between users, and propose a semi-supervised refining model that encodes similarity and dissimilarity from both user-level and post-level relations. We extensively evaluate the proposed approach and compare it with several state-of-the-art techniques on two real-world forum datasets. Experimental results show that our proposed model with 10.5% labeled samples can achieve better performance than a traditional supervised model trained on 61.7% of the data samples.
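
As a rough illustration of what a meta path-based similarity measure looks like, the sketch below computes PathSim (Sun et al., 2011), an existing measure, over the meta path user -> thread -> user. It is not the new measure proposed in this paper, and it captures only similarity, not dissimilarity. The toy data, the names `user_threads`, `commuting_counts`, and `pathsim`, and the choice of meta path are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy data: the threads each user has posted in.
user_threads: Dict[str, List[str]] = {
    "alice": ["t1", "t2"],
    "bob": ["t1", "t2"],
    "carol": ["t3"],
    "dave": ["t1"],
}


def commuting_counts(data: Dict[str, List[str]]) -> Dict[str, Dict[str, int]]:
    """Count user -> thread -> user path instances for every user pair."""
    counts: Dict[str, Dict[str, int]] = {u: {} for u in data}
    for u, threads_u in data.items():
        for v, threads_v in data.items():
            # Each (post by u, post by v) pair in a shared thread is one path.
            counts[u][v] = sum(
                threads_u.count(t) * threads_v.count(t) for t in set(threads_u)
            )
    return counts


def pathsim(counts: Dict[str, Dict[str, int]], u: str, v: str) -> float:
    """PathSim: 2 * paths(u, v) / (paths(u, u) + paths(v, v))."""
    denom = counts[u][u] + counts[v][v]
    return 2.0 * counts[u][v] / denom if denom else 0.0


counts = commuting_counts(user_threads)
sim_alice_bob = pathsim(counts, "alice", "bob")
sim_alice_carol = pathsim(counts, "alice", "carol")
sim_alice_dave = pathsim(counts, "alice", "dave")
```

On this toy data, alice and bob participate in exactly the same threads and score 1.0, while alice and carol share no thread and score 0.0. A dissimilarity signal of the kind described above would need additional information, such as opposing opinions between users, which this sketch does not model.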
