Chinese sentiment classification using a neural network tool — Word2vec

Sentiment classification is a central and widely studied task in sentiment analysis. Most existing research focuses on extracting effective lexical and syntactic features, while comparatively little work addresses semantic features, which can contribute more to sentiment classification. This paper presents a sentiment classification method based on word2vec, a tool that trains neural network models to learn vector representations of words in a high-dimensional vector space and can therefore capture deep semantic relationships between words. We first use word2vec to cluster similar features. We then use word2vec again to learn word representations that serve as candidate feature vectors. After feature selection, the SVMperf package is used to train the classifier and classify the comment texts. For the experiments, we collected a large number of Chinese comments on clothing products as the data set. The experimental results show a sentiment classification accuracy of over 90 percent, which supports the effectiveness of the proposed method for Chinese sentiment classification.
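
The two uses of word vectors described above (grouping similar features, then building feature vectors for a classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors in `TOY_VECTORS` are hypothetical stand-ins for embeddings that a trained word2vec model would supply, the greedy threshold grouping in `cluster_similar` is an assumed substitute for whatever clustering the paper uses, and averaging word vectors in `document_vector` is only one common way to obtain a fixed-length input for a linear classifier such as an SVM.

```python
import math

# Hypothetical toy embeddings; a real pipeline would read these from a
# trained word2vec model rather than hard-coding them.
TOY_VECTORS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "poor": [-0.8, 0.2, 0.1],
    "fabric": [0.0, 0.9, 0.3],
    "material": [0.1, 0.8, 0.4],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_similar(vectors, threshold=0.9):
    """Greedy single-pass grouping (an assumed, simplified scheme).

    Each word joins the first cluster whose seed word (its first member)
    has cosine similarity >= threshold; otherwise it starts a new cluster.
    """
    clusters = []
    for word, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


def document_vector(tokens, vectors):
    """Average the vectors of known tokens into one fixed-length vector."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return []
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


if __name__ == "__main__":
    print(cluster_similar(TOY_VECTORS))
    print(document_vector(["good", "fabric", "unknown"], TOY_VECTORS))
```

With these toy values, near-synonyms fall into the same group and out-of-vocabulary tokens are skipped when averaging. The resulting document vectors would then pass through feature selection and into an SVM trainer, steps this sketch does not cover.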
