Discovering and monitoring product features and the opinions on them with OPINSTREAM

Opinion stream mining encompasses methods for monitoring and understanding how people's attitudes towards products change over time. For many applications, though, it is not only a specific product that is of interest but also the properties that people consider important for the whole category of products. Understanding which product features influence a buyer's choice positively or negatively allows decision makers to make well-informed decisions on improving their products or marketing them properly. In this study, we propose OPINSTREAM, a framework for the discovery and polarity monitoring of implicit product features deemed important in people's reviews on different products. Our framework encompasses stream clustering, extraction of product features from the clusters, cluster adaptation and semi-supervised sentiment learning inside each cluster. These components build upon our earlier work on product feature discovery and monitoring (M. Zimmermann, E. Ntoutsi, Z.F. Siddiqui, M. Spiliopoulou, H.-P. Kriegel, Discovering global and local bursts in a stream of news, in: Proceedings of the 27th Annual ACM Symposium on Applied Computing, SAC'12, ACM, 2012; M. Zimmermann, E. Ntoutsi, M. Spiliopoulou, Extracting opinionated (sub)features from a stream of product reviews, in: Proceedings of the 16th International Conference on Discovery Science (DS'2013), Lecture Notes in Computer Science, vol. 8140, Springer, Singapore, 2013, pp. 340-355; M. Zimmermann, E. Ntoutsi, M. Spiliopoulou, Adaptive semi supervised opinion classifier with forgetting mechanism (to appear), in: Proceedings of the 29th Annual ACM Symposium on Applied Computing, SAC'14, ACM, 2014), with emphasis on smooth cluster adaptation. We report on the performance of OPINSTREAM on two real datasets with product reviews, whereby we evaluate both the stream clustering approach for product feature monitoring and the semi-supervised polarity monitoring method.