Opinion Feature Extraction Using Class Sequential Rules

This paper studies the problem of analyzing user comments and reviews of products sold online. Analyzing such reviews and producing a summary of them is useful to both potential customers and product manufacturers. By analyzing reviews, we mean extracting the features of products (also called opinion features) that reviewers have commented on and determining whether the opinions are positive or negative. This paper focuses on extracting opinion features from Pros and Cons, which typically consist of short phrases or incomplete sentences. We propose a language-pattern-based approach for this purpose. The language patterns are generated from Class Sequential Rules (CSRs). A CSR differs from a classic sequential pattern in that it has a fixed class (or target). We propose an algorithm to mine CSRs from a set of labeled training sequences. To perform extraction, the mined CSRs are transformed into language patterns, which are matched against Pros and Cons to extract opinion features. Experimental results show that the proposed approach is very effective.
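To make the idea of a rule with a fixed class concrete, the following is a minimal sketch of mining class sequential rules from labeled sequences. It is not the algorithm proposed in the paper, which the abstract does not specify. It is a brute-force illustration under stated assumptions: each training instance is a sequence of single items with one class label, a rule X -> y holds when X occurs as an order-preserving subsequence, and rules are kept by minimum support and confidence. The names `mine_csr`, `is_subsequence`, the thresholds, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def mine_csr(data, min_sup, min_conf, max_len=3):
    """Brute-force sketch: enumerate patterns X and keep rules X -> y.

    data is a list of (sequence, class_label) pairs (an assumed format).
    support(X -> y)    = instances containing X with class y / all instances
    confidence(X -> y) = instances containing X with class y / instances containing X
    """
    n = len(data)
    # Candidate patterns: every ordered subsequence of up to max_len items.
    candidates = set()
    for seq, _ in data:
        for k in range(1, min(max_len, len(seq)) + 1):
            for idx in combinations(range(len(seq)), k):
                candidates.add(tuple(seq[i] for i in idx))

    rules = []
    for pattern in candidates:
        # Class labels of all instances that contain the pattern.
        covered = [label for seq, label in data if is_subsequence(pattern, seq)]
        for label, hits in Counter(covered).items():
            support = hits / n
            confidence = hits / len(covered)
            if support >= min_sup and confidence >= min_conf:
                rules.append((pattern, label, support, confidence))
    return sorted(rules)


# Hypothetical toy data: four labeled sequences with classes "c1" and "c2".
data = [
    (("a", "b", "c"), "c1"),
    (("a", "c"), "c1"),
    (("b", "c"), "c2"),
    (("a", "b"), "c1"),
]
rules = mine_csr(data, min_sup=0.5, min_conf=0.9)
for pattern, label, support, confidence in rules:
    print(pattern, "->", label, support, confidence)
```

Unlike an unsupervised sequential pattern, each rule here is tied to a class, so the same left-hand side can be kept for one class and rejected for another. Exhaustive enumeration is only practical for short sequences such as the phrases found in Pros and Cons; a pattern-growth miner would be the usual choice at scale.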
