Implicit feature identification via hybrid association rule mining

In sentiment analysis, finer-grained opinion mining considers not only the overall view of a product but also opinions on product features, where a feature can be a component or an attribute of the product. Previous research has relied mainly on explicit features and ignored implicit ones. However, implicit features, which are implied by certain words or phrases rather than stated outright, also convey users' opinions and help us better understand their comments. Detecting these implicit features in Chinese product reviews is a major challenge because of the complexity of the Chinese language. This paper focuses on implicit feature identification in Chinese product reviews and proposes a novel hybrid association rule mining method for the task. The core idea of the approach is to mine as many association rules as possible using several complementary algorithms. First, we extract candidate feature indicators based on word segmentation, part-of-speech (POS) tagging and feature clustering, and then compute the co-occurrence degree between the candidate feature indicators and the feature words using five collocation extraction algorithms. Each indicator and its corresponding feature word constitute a rule (feature indicator -> feature word). The best rules from the five resulting rule sets are chosen as the basic rules. Next, three methods are proposed to mine further plausible rules from the feature indicators with lower co-occurrence and from non-indicator words. Finally, the expanded rules are used to identify implicit features, and the results are compared with the previous results. Experimental results demonstrate that the proposed approach is competent at the task, especially when the expansion methods are applied. Recall is effectively improved, suggesting that the shortcomings of the basic rules have been overcome to a certain extent. Besides rules built from indicators with a high co-occurrence degree, the final rule set also contains uncommon rules.
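
The rule-mining step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the five collocation extraction algorithms, so pointwise mutual information (PMI) is used here as one assumed co-occurrence measure, and the function names, the thresholds `min_count` and `min_pmi`, and the English toy clauses are all hypothetical. The input is assumed to be already segmented into tokens; POS tagging and feature clustering are omitted.

```python
import math
from collections import Counter
from itertools import product


def mine_rules(clauses, feature_words, min_count=2, min_pmi=0.0):
    """Mine rules (indicator -> feature word) from tokenized clauses.

    PMI is an assumed stand-in for a single collocation measure; the
    thresholds are illustrative, not taken from the paper.
    Returns a dict: indicator -> (feature word, PMI score).
    """
    n = len(clauses)
    indicator_count = Counter()
    feature_count = Counter()
    pair_count = Counter()

    for tokens in clauses:
        unique = set(tokens)
        features = unique & feature_words
        indicators = unique - feature_words
        feature_count.update(features)
        indicator_count.update(indicators)
        # Count each (indicator, feature) pair once per clause.
        pair_count.update(product(indicators, features))

    rules = {}
    for (indicator, feature), count in pair_count.items():
        if count < min_count:
            continue
        pmi = math.log2(
            count * n / (indicator_count[indicator] * feature_count[feature])
        )
        # Keep only the best-scoring feature word for each indicator.
        if pmi >= min_pmi and (indicator not in rules or pmi > rules[indicator][1]):
            rules[indicator] = (feature, pmi)
    return rules


def identify_implicit_feature(tokens, rules):
    """Return the feature implied by the strongest matching rule, or None."""
    matches = [rules[t] for t in tokens if t in rules]
    if not matches:
        return None
    return max(matches, key=lambda match: match[1])[0]


# Hypothetical toy data: clauses that mention an explicit feature word.
toy_clauses = [
    ["price", "expensive"],
    ["price", "expensive"],
    ["price", "cheap"],
    ["screen", "clear"],
    ["screen", "clear"],
    ["screen", "expensive"],
]
toy_rules = mine_rules(toy_clauses, {"price", "screen"})
```

With these toy clauses, a clause such as `["too", "expensive"]`, which names no feature word, is mapped to `price` by `identify_implicit_feature`. Low-count pairs such as (`cheap`, `price`) are discarded by `min_count`, which is the kind of gap the expansion methods for low co-occurrence indicators are meant to address.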
