Cross-Domain Sentiment Analysis Employing Different Feature Selection and Classification Techniques

A central task in information gathering is finding out what people think. Sentiment analysis is the task of determining the polarity of a given text, which is classified into two categories: positive and negative. Under the bag-of-words (BOW) approach, sentiment analysis operates on very large feature sets of unique terms, many of which carry little useful information on their own. This makes it necessary to eliminate irrelevant and uninformative terms from the feature set. A further challenge is that the training data often does not come from the domain of the test data to be analyzed. This paper addresses both challenges by investigating feature selection (FS) methods in cross-domain sentiment analysis. The benefit of combining cross-domain analysis with feature selection is that processing requires significantly less computational power and time. The selected informative features are used to train classifiers, whose performance is evaluated in terms of classification accuracy. The FS methods (IG, GR, CHI, SAE) were evaluated on standard datasets, namely the Amazon product review dataset and the TripAdvisor dataset, with NB, SVM, DT, and KNN classifiers. The paper examines techniques by which cross-domain analysis, despite its lower accuracy caused by the difference between domains, emerges as the more computationally efficient method.
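
The pipeline described above (score bag-of-words terms, keep the most informative ones, train on one domain, test on another) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes information gain over term presence as the FS method and a multinomial Naive Bayes classifier with add-one smoothing, and the tiny product and hotel review sets, the function names, and the choice of k = 5 are all hypothetical stand-ins for the Amazon and TripAdvisor data.

```python
import math
from collections import Counter

def tokenize(text):
    # Whitespace bag-of-words tokenizer (an assumption for this sketch).
    return text.lower().split()

def entropy(counts):
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(docs, labels):
    """Score each term by the information gain of its presence/absence."""
    classes = sorted(set(labels))
    n = len(docs)
    base = entropy([labels.count(c) for c in classes])
    doc_sets = [set(tokenize(d)) for d in docs]
    scores = {}
    for term in set().union(*doc_sets):
        present = Counter(l for s, l in zip(doc_sets, labels) if term in s)
        absent = Counter(l for s, l in zip(doc_sets, labels) if term not in s)
        n_present = sum(present.values())
        conditional = (n_present / n) * entropy([present[c] for c in classes])
        conditional += ((n - n_present) / n) * entropy([absent[c] for c in classes])
        scores[term] = base - conditional
    return scores

def select_top_k(scores, k):
    # Highest score first; ties broken alphabetically for determinism.
    return set(sorted(scores, key=lambda t: (-scores[t], t))[:k])

def train_nb(docs, labels, features):
    """Multinomial Naive Bayes restricted to the selected features."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        counts[label].update(t for t in tokenize(doc) if t in features)
    loglik = {}
    for c in classes:
        total = sum(counts[c].values()) + len(features)  # add-one smoothing
        loglik[c] = {t: math.log((counts[c][t] + 1) / total) for t in features}
    return prior, loglik

def predict(model, doc):
    prior, loglik = model
    tokens = tokenize(doc)
    # Tokens outside the selected feature set contribute nothing.
    return max(prior, key=lambda c: prior[c] + sum(loglik[c].get(t, 0.0) for t in tokens))

# Hypothetical source domain (product reviews) used for training.
source_docs = [
    "great camera excellent battery",
    "excellent phone great screen",
    "good laptop great value",
    "terrible camera poor battery",
    "poor phone awful screen",
    "awful laptop terrible value",
]
source_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

# Hypothetical target domain (hotel reviews) used only for testing.
target_docs = ["great hotel excellent staff", "terrible room awful service"]
target_labels = ["pos", "neg"]

selected = select_top_k(information_gain(source_docs, source_labels), k=5)
model = train_nb(source_docs, source_labels, selected)
predictions = [predict(model, d) for d in target_docs]
accuracy = sum(p == t for p, t in zip(predictions, target_labels)) / len(target_labels)
print(sorted(selected), predictions, accuracy)
```

In this toy setting, domain-specific nouns such as "camera" or "laptop" appear equally often in both classes and receive zero information gain, so only polarity-bearing terms survive selection. That is one intuition for why a reduced feature set can transfer across domains. Real cross-domain data is far noisier, and the lower accuracy reported in the abstract should be expected.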
