Feature Reduction using Principal Component Analysis for Opinion Mining

Opinions express users' viewpoints, and reviews give information about how a product is perceived. Online reviews are now widely used to judge the quality of a product or service, and they influence users' decisions when selecting one. Opinions are increasingly available as reviews and feedback on websites, blogs, and microblogs, where they influence future customers. Because it is not feasible to handle the huge volume of opinions generated online manually, opinion mining uses automatic processes to extract reviews and to discriminate relevant information by sentiment orientation. This paper proposes extracting a feature set from movie reviews. Inverse document frequency is computed, and the feature set is reduced using Principal Component Analysis. The effectiveness of the pre-processing is evaluated using Naive Bayes and Linear Vector Quantization classifiers.
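The pre-processing described above (inverse document frequency weighting followed by PCA-based feature reduction) can be illustrated with a minimal sketch. Nothing here is taken from the paper's implementation: the toy reviews, the whitespace-style tokens, the choice to weight features as term frequency times IDF, and the use of power iteration with deflation to find the principal components are all assumptions made for illustration. The function names `idf_weights`, `tfidf_matrix`, and `pca_reduce` are hypothetical.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Return {term: log(N / df)} for a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf_matrix(docs, idf):
    """Build a documents-by-terms matrix; weighting by tf * idf is an assumption."""
    vocab = sorted(idf)
    rows = []
    for doc in docs:
        tf = Counter(doc)
        rows.append([tf[t] * idf[t] for t in vocab])
    return vocab, rows


def pca_reduce(rows, k, iters=200):
    """Project rows onto k principal components (power iteration with deflation)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    x = [[r[j] - means[j] for j in range(d)] for r in rows]  # centered data
    cov = [[sum(x[i][a] * x[i][b] for i in range(n)) / (n - 1)
            for b in range(d)] for a in range(d)]
    components = []
    for c in range(k):
        # Deterministic, normalized starting vector (an arbitrary choice).
        v = [1.0 / (j + 1 + c) for j in range(d)]
        norm = math.sqrt(sum(t * t for t in v))
        v = [t / norm for t in v]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(t * t for t in w))
            if norm < 1e-12:  # no variance left to capture
                break
            v = [t / norm for t in w]
        # Rayleigh quotient gives the eigenvalue estimate for this component.
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        components.append(v)
        # Deflate: remove the variance explained by this component.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return [[sum(xi[j] * comp[j] for j in range(d)) for comp in components]
            for xi in x]


# Toy movie reviews, already tokenized (illustrative data only).
docs = [
    ["great", "movie", "great", "acting"],
    ["boring", "movie", "bad", "plot"],
    ["great", "plot", "movie"],
    ["bad", "acting", "boring", "movie"],
]
idf = idf_weights(docs)
vocab, rows = tfidf_matrix(docs, idf)
reduced = pca_reduce(rows, k=2)  # 6 term features reduced to 2 per review
```

In this sketch, a term that appears in every review (here "movie") receives an IDF of zero and so contributes nothing to the feature vectors. The reduced vectors in `reduced` are what would then be passed to a classifier such as Naive Bayes; for a real corpus, a numerical library would be a more practical choice than the pure-Python loops shown here.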
