Sentiment Classification with Graph Co-Regularization

Sentiment classification aims to automatically predict sentiment polarity (e.g., positive or negative) of user-generated sentiment data (e.g., reviews, blogs). To obtain sentiment classification with high accuracy, supervised techniques require a large amount of manually labeled data. The labeling work can be time-consuming and expensive, which makes unsupervised (or semisupervised) sentiment analysis essential for this application. In this paper, we propose a novel algorithm, called graph co-regularized non-negative matrix tri-factorization (GNMTF), from the geometric perspective. GNMTF assumes that if two words (or documents) are sufficiently close to each other, they tend to share the same sentiment polarity. To achieve this, we encode the geometric information by constructing the nearest neighbor graphs, in conjunction with a nonnegative matrix tri-factorization framework. We derive an efficient algorithm for learning the factorization, analyze its complexity, and provide proof of convergence. Our empirical study on two open data sets validates that GNMTF can consistently improve the sentiment classification accuracy in comparison to the state-of-the-art methods.

[1]  Bo Pang,et al.  Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[2]  Xiaojun Wan,et al.  Co-Training for Cross-Lingual Sentiment Classification , 2009, ACL.

[3]  Yue Lu,et al.  Automatic construction of a context-aware sentiment lexicon: an optimization approach , 2011, WWW.

[4]  Maite Taboada,et al.  Lexicon-Based Methods for Sentiment Analysis , 2011, CL.

[5]  Mikhail Belkin,et al.  Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering , 2001, NIPS.

[6]  Inderjit S. Dhillon,et al.  Information-theoretic co-clustering , 2003, KDD '03.

[7]  Inderjit S. Dhillon,et al.  Minimum Sum-Squared Residue Co-Clustering of Gene Expression Data , 2004, SDM.

[8]  Nan Sun,et al.  Exploiting internal and external semantics for the clustering of short texts using world knowledge , 2009, CIKM.

[9]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[10]  Guodong Zhou,et al.  Semi-Supervised Learning for Imbalanced Sentiment Classification , 2011, IJCAI.

[11]  Claire Cardie,et al.  Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora , 2011, ACL.

[12]  Navneet Kaur,et al.  Opinion mining and sentiment analysis , 2016, 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom).

[13]  Yoshua Bengio,et al.  Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.

[14]  Qiang Yang,et al.  Cross-domain sentiment classification via spectral feature alignment , 2010, WWW '10.

[15]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[16]  Bing Liu,et al.  Mining and summarizing customer reviews , 2004, KDD.

[17]  Fang Liu,et al.  Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization , 2013, ACL.

[18]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[19]  Chih-Jen Lin,et al.  Projected Gradient Methods for Nonnegative Matrix Factorization , 2007, Neural Computation.

[20]  John Blitzer,et al.  Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification , 2007, ACL.

[21]  Wei Peng,et al.  Generate Adjective Sentiment Dictionary for Social Media Sentiment Analysis Using Constrained Nonnegative Matrix Factorization , 2021, ICWSM.

[22]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[23]  Chris H. Q. Ding,et al.  Orthogonal nonnegative matrix t-factorizations for clustering , 2006, KDD '06.

[24]  Hyunsoo Kim,et al.  Nonnegative Matrix Factorization Based on Alternating Nonnegativity Constrained Least Squares and Active Set Method , 2008, SIAM J. Matrix Anal. Appl..

[25]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.

[26]  U. Feige,et al.  Spectral Graph Theory , 2015 .

[27]  Houfeng Wang,et al.  Cross-Lingual Mixture Model for Sentiment Classification , 2012, ACL.

[28]  Andrew McCallum,et al.  Relation Extraction with Matrix Factorization and Universal Schemas , 2013, NAACL.

[29]  Danushka Bollegala,et al.  Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification , 2011, ACL.

[30]  Huan Liu,et al.  Unsupervised sentiment analysis with emotional signals , 2013, WWW.

[31]  Xiaojin Zhu,et al.  Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization , 2006 .

[32]  Vikas Sindhwani,et al.  Document-Word Co-regularization for Semi-supervised Sentiment Analysis , 2008, 2008 Eighth IEEE International Conference on Data Mining.

[33]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[34]  Peter D. Turney Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews , 2002, ACL.

[35]  Harith Alani,et al.  Automatically Extracting Polarity-Bearing Topics for Cross-Domain Sentiment Classification , 2011, ACL.

[36]  Thomas S. Huang,et al.  Graph Regularized Nonnegative Matrix Factorization for Data Representation. , 2011, IEEE transactions on pattern analysis and machine intelligence.

[37]  Tao Li,et al.  A Non-negative Matrix Tri-factorization Approach to Sentiment Classification with Lexical Prior Knowledge , 2009, ACL.