Domain Specific Opinion Retrieval

Opinion retrieval is a novel information retrieval task that has attracted a great deal of attention with the rapid growth of online opinionated information. Most previous work adopts the classical two-stage framework, i.e., first retrieving topic-relevant documents and then re-ranking them according to opinion relevance. However, none of this work has considered the problem of domain coherence between queries and topic-relevant documents. In this work, we propose to address this problem by measuring the similarity in the usage of opinion words (the words users employ to express opinions). Our work is based on the observation that opinion words are domain dependent. We reformulate the problem as measuring the opinion similarity between the domain opinion models of queries and document opinion models. An opinion model is constructed to capture the distribution of opinion words. The basic idea is that if a document has high opinion similarity with a domain opinion model, the document is not only opinionated but also in the same domain as the query (i.e., it exhibits domain coherence). Experimental results show that our approach performs comparably to state-of-the-art work.
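
The idea of comparing a domain opinion model with a document opinion model can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the smoothing scheme, or the lexicon, so everything below is an assumption made for illustration: the function names (`opinion_model`, `kl_divergence`, `opinion_similarity`), the additive smoothing constant, the toy four-word lexicon, and the use of negative Kullback-Leibler divergence as the similarity score. It is not the authors' implementation.

```python
import math
from collections import Counter


def opinion_model(tokens, opinion_lexicon, smoothing=0.01):
    """Build a smoothed probability distribution over opinion words.

    Assumption: additive smoothing is used so that every lexicon word
    has non-zero probability; the paper may use a different estimator.
    """
    counts = Counter(t for t in tokens if t in opinion_lexicon)
    total = sum(counts.values()) + smoothing * len(opinion_lexicon)
    return {w: (counts[w] + smoothing) / total for w in opinion_lexicon}


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) over a shared vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


def opinion_similarity(domain_model, doc_model):
    """Hypothetical similarity score: higher means the document's usage
    of opinion words is closer to the domain's usage."""
    return -kl_divergence(domain_model, doc_model)


# Toy lexicon mixing camera-domain and food-domain opinion words (illustrative only).
lexicon = {"sharp", "blurry", "delicious", "bland"}

# Domain opinion model for a camera-related query, built from hypothetical domain text.
camera_domain = opinion_model(["sharp", "sharp", "blurry", "lens"], lexicon)

# Two candidate documents: one camera review, one restaurant review.
camera_doc = opinion_model(["the", "lens", "is", "sharp", "not", "blurry"], lexicon)
food_doc = opinion_model(["delicious", "food", "bland", "service"], lexicon)

score_camera = opinion_similarity(camera_domain, camera_doc)
score_food = opinion_similarity(camera_domain, food_doc)
```

In this toy setup the camera review scores higher than the restaurant review against the camera domain model, which mirrors the intuition that a document sharing the domain's opinion vocabulary is both opinionated and domain coherent. In a two-stage pipeline, a score of this kind could be combined with the topic-relevance score during re-ranking; how the two are combined is left open here.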
