Probabilistic Language Modelling for Context-Sensitive Opinion Mining

Existing opinion mining methods often rely on either a static lexicon or supervised machine learning to identify sentiment indicators in text. However, the lexicon-based approach often fails to capture the context-sensitive semantics of opinion indicators, and the supervised approach requires a large number of human-labeled training examples. The main contribution of this paper is a novel opinion mining method underpinned by context-sensitive text mining and probabilistic language modeling, designed to improve the effectiveness of opinion mining. Our initial experiments show that the proposed opinion mining method outperforms the purely lexicon-based method on several benchmark measures. The practical implication of our work is that business managers can apply our methodology to extract business analytics more effectively from user-contributed content on the Social Web.
