Turning Online Product Reviews to Customer Knowledge: A Semantic-based Sentiment Classification Approach

Many product review websites (e.g., epinion.com, Rateitall.com) have been established to collect user reviews of a variety of products. It has also become common practice for merchants and product manufacturers to set up online forums where customers can review or express opinions on products they are interested in or have purchased. To help merchants, product manufacturers, and customers exploit online product reviews for marketing, product design, or purchasing decisions, classifying product reviews into positive and negative categories is essential. In this study, we propose a Semantic-based Sentiment Classification (SSC) technique that constructs a sentiment classification model from a training set of precategorized product reviews, on the basis of a collection of positive and negative cue features. The SSC technique also includes a semantic expansion mechanism that uses WordNet to expand the given set of positive and negative cue features. Our empirical evaluation on three product review corpora suggests that the proposed SSC technique achieves higher classification effectiveness than the traditional syntactic-level sentiment classification technique. Moreover, with only a few seed features (e.g., 10 or 20), the SSC technique attains classification effectiveness comparable to that obtained with the comprehensive list of positive and negative cue features (4,206 words in total) defined in the General Inquirer.
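The idea of expanding a small seed set of cue features can be illustrated with a minimal sketch. It is not the SSC implementation: the abstract does not specify the expansion rules or the learned model, so the sketch assumes a one-hop synonym expansion and a simple hit-counting rule in place of a trained classifier. The names `TOY_SYNONYMS`, `expand`, and `classify` are hypothetical, and the toy synonym table merely stands in for WordNet lookups.

```python
import re
from typing import Dict, List, Set

# Hypothetical stand-in for WordNet synonym lookups (toy data, not from the paper).
TOY_SYNONYMS: Dict[str, Set[str]] = {
    "good": {"fine", "solid"},
    "great": {"excellent", "superb"},
    "bad": {"poor", "awful"},
    "broken": {"faulty", "defective"},
}


def expand(seeds: Set[str], synonyms: Dict[str, Set[str]]) -> Set[str]:
    """Return the seeds plus their one-hop synonyms (assumed expansion rule)."""
    expanded = set(seeds)
    for word in seeds:
        expanded |= synonyms.get(word, set())
    return expanded


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def classify(review: str, positive: Set[str], negative: Set[str]) -> str:
    """Label a review by comparing counts of positive and negative cue hits.

    Assumption: a counting rule replaces the trained model described in the abstract.
    """
    tokens = tokenize(review)
    pos_hits = sum(token in positive for token in tokens)
    neg_hits = sum(token in negative for token in tokens)
    if pos_hits > neg_hits:
        return "positive"
    if neg_hits > pos_hits:
        return "negative"
    return "undecided"


# Small seed sets, expanded once through the toy synonym table.
POSITIVE = expand({"good", "great"}, TOY_SYNONYMS)
NEGATIVE = expand({"bad", "broken"}, TOY_SYNONYMS)
```

With the seeds alone, a review such as "The screen is superb" contains no cue feature and stays undecided; after expansion, "superb" is reached through "great" and the review is labeled positive. This is the kind of coverage gain that motivates semantic expansion from a few seed features.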
