Different Feature Selection for Sentiment Classification

Sentiment analysis (SA) research has grown tremendously in recent years. Sentiment analysis aims to extract users' opinions from review documents. Sentiment classification using machine learning (ML) methods faces the problem of a high-dimensional feature vector. A feature selection method is therefore required to eliminate irrelevant and noisy features from the feature vector so that ML algorithms can work efficiently. Rough set theory (RST) provides an important concept for feature reduction called a reduct. The cost of computing a reduct set is strongly influenced by the number of attributes in the dataset, and the problem of finding reducts has been proven to be NP-hard. Different feature selection methods are applied to different datasets. Experimental results show that minimum redundancy maximum relevance (mRMR) performs better than Information Gain (IG) for sentiment classification, and that a hybrid feature selection method based on RST and IG performs better than the previous methods. The proposed methods are evaluated on four standard datasets, viz. the movie review dataset and the product review datasets (book, DVD, and electronics). Experimental results show that the hybrid feature selection method outperforms the other feature selection methods for sentiment classification.
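To make the Information Gain criterion mentioned above concrete, the following is a minimal sketch of IG-based feature ranking for binary term-presence features. It is not the authors' implementation: the function names (`entropy`, `information_gain`, `select_top_k`) and the four toy reviews are hypothetical, and the mRMR and rough-set reduct steps are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG of a term: H(C) minus the entropy of C conditioned on term presence."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def select_top_k(docs, labels, k):
    """Rank every vocabulary term by IG and keep the k highest-scoring ones."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy corpus: each review is the set of terms it contains.
docs = [
    {"great", "movie", "plot"},
    {"great", "acting", "movie"},
    {"boring", "movie", "plot"},
    {"boring", "acting", "movie"},
]
labels = ["pos", "pos", "neg", "neg"]

selected = select_top_k(docs, labels, k=2)
```

In this toy corpus the terms "great" and "boring" separate the two classes perfectly, so they receive the highest score, while "movie" appears in every review and carries no information about the class.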
