A weighted lexicon-based generative model for opinion retrieval

In recent years, opinion retrieval has attracted growing research interest as online users' opinions have become increasingly valuable for market surveys, political polls, and similar applications. The goal of opinion retrieval is to find documents that are both relevant and opinionated with respect to a user's query. Previous lexicon-based generative models for opinion retrieval treat all sentiment words as equally important for a query, and so cannot reflect differences in the sentiment words' relevant opinion strength. We propose a graph-based approach that uses the HITS model to capture each sentiment word's relevant opinion strength. The resulting weights are then incorporated into a weighted lexicon-based generative model for opinion retrieval. Experimental results on two real datasets show the effectiveness of the proposed generative model: compared with the baseline approach, it obtains improvements of 4% and 11%.
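
The weighting idea can be illustrated with a minimal sketch of HITS. The abstract does not say how the graph is built, so the construction here is an assumption made only for illustration: documents point to the sentiment words they contain, edge weights are occurrence counts, and a word's authority score is read as its opinion strength. The function name `hits`, the toy documents `d1` to `d3`, and the words `great` and `fine` are all hypothetical.

```python
import math


def hits(edges, iterations=50):
    """Power-iteration HITS on (source, target, weight) triples.

    Returns (hub, authority) dicts, each L2-normalised.
    """
    nodes = {n for s, t, _ in edges for n in (s, t)}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of in-neighbours.
        new_auth = {n: 0.0 for n in nodes}
        for s, t, w in edges:
            new_auth[t] += w * hub[s]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}
        # Hub update: weighted sum of authority scores of out-neighbours.
        new_hub = {n: 0.0 for n in nodes}
        for s, t, w in edges:
            new_hub[s] += w * auth[t]
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return hub, auth


# Assumed toy graph: document -> sentiment word, weighted by occurrence count.
edges = [
    ("d1", "great", 2.0),
    ("d1", "fine", 1.0),
    ("d2", "great", 1.0),
    ("d3", "fine", 1.0),
]
hub, auth = hits(edges)

# Rescale the word authorities so they sum to 1 and can serve as lexicon weights.
words = ["great", "fine"]
total = sum(auth[w] for w in words)
weights = {w: auth[w] / total for w in words}
```

In this toy graph `great` receives a larger weight than `fine`, because it occurs more often and in the document with the highest hub score. How such weights enter the generative model is not described in the abstract, so that step is left out.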
