Application of Semi-supervised Learning to Evaluative Expression Classification

We propose using semi-supervised learning to classify evaluative expressions, that is, tuples of a subject, an attribute, and an evaluative word that indicate a favorable or unfavorable opinion of a specific subject. The method classifies the evaluative expressions in a corpus by polarity, starting from a very small set of seed training examples and using contextual information from the sentences in which the expressions occur. Our experiments on actual Weblog data show that this bootstrapping approach can improve the accuracy of methods for classifying favorable and unfavorable opinions.
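The abstract does not specify the learning algorithm, so the following is only a generic self-training sketch of the bootstrapping idea, not the authors' method. It assumes each evaluative expression is represented by the bag of context words from its sentence, and it uses a smoothed word-count log-odds score as a stand-in classifier. The function names (`train`, `score`, `bootstrap`), the `threshold` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def train(labeled):
    """Count context words per polarity (+1 favorable, -1 unfavorable)."""
    counts = {+1: Counter(), -1: Counter()}
    for words, label in labeled:
        counts[label].update(words)
    return counts

def score(counts, words, vocab_size):
    """Log-odds of favorable vs. unfavorable, with add-one smoothing."""
    total = {y: sum(c.values()) for y, c in counts.items()}
    s = 0.0
    for w in words:
        p_pos = (counts[+1][w] + 1) / (total[+1] + vocab_size)
        p_neg = (counts[-1][w] + 1) / (total[-1] + vocab_size)
        s += math.log(p_pos / p_neg)
    return s

def bootstrap(seeds, unlabeled, threshold=0.5, max_rounds=10):
    """Self-training: repeatedly label confident examples and retrain."""
    labeled = list(seeds)
    pool = list(unlabeled)
    vocab = {w for words, _ in labeled for w in words}
    vocab |= {w for words in pool for w in words}
    for _ in range(max_rounds):
        counts = train(labeled)
        confident, rest = [], []
        for words in pool:
            s = score(counts, words, len(vocab))
            if abs(s) >= threshold:
                confident.append((words, 1 if s > 0 else -1))
            else:
                rest.append(words)
        if not confident:  # nothing new was labeled, so stop
            break
        labeled.extend(confident)
        pool = rest
    return labeled, pool

# Toy data (assumed): two seed examples and four unlabeled contexts.
seeds = [(["great", "battery"], +1), (["poor", "battery"], -1)]
unlabeled = [["great", "love"], ["poor", "hate"],
             ["love", "screen"], ["hate", "screen"]]
labeled, remaining = bootstrap(seeds, unlabeled)
```

In this toy run, "love" and "hate" are not in the seeds; they acquire a polarity only after the first round labels the contexts that share a word with a seed, which is the kind of propagation through context that bootstrapping relies on.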
