Feature selection for Chinese online reviews sentiment classification

Traditional feature selection methods such as document frequency (DF), mutual information (MI) and information gain (IG) often discard useful information. To address this, we propose Feature Selection for Chinese Online Reviews Sentiment Classification (FSCSC). FSCSC takes empirical analysis into account and focuses on how to select different types of features effectively, using statistical approaches, to improve sentiment classification performance. FSCSC was tested on a corpus of 4000 Chinese online reviews. The experiment indicates that FSCSC can improve classification effectiveness.
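For readers unfamiliar with the baseline methods named above, the following is a minimal sketch of how DF and IG score a term on a labeled corpus. It illustrates the traditional baselines only, not FSCSC itself. The toy corpus, the pre-segmented tokens, and the function names are assumptions made for illustration and do not come from the paper.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens, _label in docs:
        df.update(set(tokens))  # set(): count each term once per document
    return df


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, term):
    """IG(t) = H(C) - [P(t) * H(C | t) + P(not t) * H(C | not t)]."""
    labels = sorted({label for _tokens, label in docs})
    with_t = Counter(label for tokens, label in docs if term in tokens)
    without_t = Counter(label for tokens, label in docs if term not in tokens)
    n = len(docs)
    n_t = sum(with_t.values())
    n_not = n - n_t
    h_c = entropy([with_t[l] + without_t[l] for l in labels])
    h_t = entropy([with_t[l] for l in labels]) if n_t else 0.0
    h_not = entropy([without_t[l] for l in labels]) if n_not else 0.0
    return h_c - (n_t / n) * h_t - (n_not / n) * h_not


# Hypothetical, already word-segmented toy reviews (not from the paper's corpus).
docs = [
    (["质量", "好"], "pos"),
    (["服务", "好"], "pos"),
    (["质量", "差"], "neg"),
    (["服务", "差"], "neg"),
]

df = document_frequency(docs)
for term in ["好", "质量"]:
    print(term, df[term], round(information_gain(docs, term), 3))
```

In this toy corpus every term has the same document frequency, so DF alone cannot separate the sentiment word "好" from the topic word "质量", whereas IG does. This is one simple way a purely frequency-based criterion can lose information that matters for sentiment classification.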
