AN OPINION FEATURE EXTRACTION APPROACH BASED ON A MULTIDIMENSIONAL SENTENCE ANALYSIS MODEL

With the wide adoption of Web 2.0 applications, social networking services such as weblogs, forums, and other online communities have become informative tools that help individuals gauge the pulse of the electronic consumer market. As a substitute for traditional public media, these sites offer unique mechanisms that instantly reveal the degree of public acceptance of a product, either by statistically aggregating ratings or by archiving opinions shared by experienced customers. However, the growing volume of user-generated information and its scattered, unstructured content overwhelms users, creating demand for more efficient systems that can offer concise information. Most existing efforts dedicated to these issues may neglect vital aspects of the sentence-level context. This article explores the critical features hidden in the sentential structure of opinion articles, in the expectation that the detected patterns can contribute to the enhancement of related applications. Accordingly, a multidimensional sentence modeling algorithm (MSMA) is designed to evaluate various sentential characteristics, and a genetic algorithm is adopted to optimize the weighting scheme that determines feature importance. The study also uses the public knowledge resource Wikipedia as a global reference to fine-tune the effectiveness of the feature set and enhance the overall performance of the framework. Experiments on an electronic product data set demonstrate that the proposed method is promising and provides significant improvement over previous studies.
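The abstract does not specify the sentence features, the fitness function, or the genetic operators used in MSMA, so the following is only a minimal sketch of the general idea of tuning feature weights with a genetic algorithm. Everything in it is an assumption made for illustration: the three hypothetical feature values per sentence, the 0.5 decision threshold, the accuracy-based fitness, and the names `score`, `fitness`, and `evolve`.

```python
import random


def score(weights, features):
    """Weighted sum of one sentence's feature values."""
    return sum(w * f for w, f in zip(weights, features))


def fitness(weights, sentences, labels):
    """Fraction of sentences whose thresholded score matches its label.

    The 0.5 threshold is an illustrative assumption, not taken from the paper.
    """
    hits = sum(
        (score(weights, feats) >= 0.5) == bool(label)
        for feats, label in zip(sentences, labels)
    )
    return hits / len(sentences)


def evolve(sentences, labels, n_features, pop_size=30, generations=40,
           mutation_rate=0.2, seed=0):
    """Search for a weight vector in [0, 1]^n_features (needs n_features >= 2)."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half of the population unchanged (elitism).
        ranked = sorted(population,
                        key=lambda w: fitness(w, sentences, labels),
                        reverse=True)
        elite = ranked[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            # One-point crossover between two distinct elite parents.
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clamped to keep each weight in [0, 1].
            child = [
                min(1.0, max(0.0, w + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate else w
                for w in child
            ]
            children.append(child)
        population = elite + children
    return max(population, key=lambda w: fitness(w, sentences, labels))


# Toy data (hypothetical): each row holds three made-up feature values for a
# sentence; the label marks whether the sentence is treated as opinion-bearing.
sentences = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
]
labels = [1, 0, 0, 1, 0, 1]

best = evolve(sentences, labels, n_features=3)
print("weights:", [round(w, 2) for w in best])
print("accuracy on toy data:", fitness(best, sentences, labels))
```

A fixed seed keeps the run repeatable. In a real setting the fitness would be measured on held-out data rather than on the sentences used for the search.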
