A Semi-Supervised Self-Adaptive Classifier over Opinionated Streams

We investigate the problem of polarity learning over a stream of opinionated documents. We deal with two challenges. First, the arriving opinions are not labeled, and we cannot assume that a human expert will be regularly and frequently available to assess the sentiment of arriving documents for learning and model adaptation. Second, the vocabulary of the stream, and thus the feature space used for learning, changes over time: people use an abundance of words, and sometimes even invent new ones to express their feelings. We propose a semi-supervised opinion stream classification algorithm that uses only an initial training set of labeled documents for polarity learning and gradually adapts to changes in the vocabulary. In particular, our algorithm S*3 Learner starts with the vocabulary of opinionated words that occur in the documents of the initial training set, and then expands it with new words as soon as there is enough evidence for estimating their polarity. We study the performance of S*3 Learner on opinionated streams under the natural order of document arrival and under a modified ordering that allows us to simulate vocabulary evolution.
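The general idea can be illustrated with a minimal sketch. It is not the S*3 Learner itself: the abstract does not specify the base classifier or the evidence criterion, so the sketch assumes a multinomial Naive Bayes model with Laplace smoothing, self-training on confidently classified documents, and a simple occurrence threshold for admitting new words. The class name `SelfTrainingPolarityLearner` and the parameters `min_evidence` and `min_confidence` are hypothetical.

```python
import math
from collections import Counter

LABELS = ("pos", "neg")


class SelfTrainingPolarityLearner:
    """Illustrative sketch only; NOT the authors' S*3 Learner."""

    def __init__(self, min_evidence=3, min_confidence=0.75):
        # Assumed criteria: a new word joins the vocabulary after
        # `min_evidence` occurrences in confidently self-labeled documents.
        self.min_evidence = min_evidence
        self.min_confidence = min_confidence
        self.doc_counts = Counter()
        self.word_counts = {label: Counter() for label in LABELS}
        self.vocabulary = set()
        self.pending = {}  # candidate word -> Counter of per-label occurrences

    def fit_initial(self, labeled_docs):
        """Learn from the initial labeled set: iterable of (tokens, label)."""
        for tokens, label in labeled_docs:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def predict_proba(self, tokens):
        """Class probabilities using only words currently in the vocabulary."""
        known = [t for t in tokens if t in self.vocabulary]
        n_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        log_scores = {}
        for label in LABELS:
            total = sum(self.word_counts[label].values())
            score = math.log((self.doc_counts[label] + 1) / (n_docs + len(LABELS)))
            for t in known:
                score += math.log((self.word_counts[label][t] + 1) / (total + vocab_size))
            log_scores[label] = score
        top = max(log_scores.values())
        exp_scores = {l: math.exp(s - top) for l, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {l: e / norm for l, e in exp_scores.items()}

    def update(self, tokens):
        """Classify an unlabeled document; adapt only if the prediction is confident."""
        proba = self.predict_proba(tokens)
        label = max(proba, key=proba.get)
        if proba[label] < self.min_confidence:
            return label  # too uncertain to use as a self-assigned label
        self.doc_counts[label] += 1
        for t in tokens:
            if t in self.vocabulary:
                self.word_counts[label][t] += 1
            else:
                counts = self.pending.setdefault(t, Counter())
                counts[label] += 1
                if sum(counts.values()) >= self.min_evidence:
                    self._promote(t)
        return label

    def _promote(self, word):
        """Move a candidate word, with its accumulated counts, into the vocabulary."""
        counts = self.pending.pop(word)
        self.vocabulary.add(word)
        for label, n in counts.items():
            self.word_counts[label][word] += n
```

In this sketch, calling `fit_initial` once and then `update` on each arriving document mirrors the setting described above: labels are used only at the start, and an unseen word influences predictions only after it has been promoted. A real stream learner would also need to handle forgetting of outdated words and the risk of reinforcing its own mistakes, which the sketch ignores.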
