Opinion retrieval from blogs

Opinion retrieval is a document retrieval task in which documents are retrieved and ranked according to the opinions they express about a query topic. A relevant document must satisfy two criteria: it must be relevant to the query topic, and it must contain opinions about the query, whether positive or negative. In this paper, we describe an opinion retrieval algorithm with three components: a traditional information retrieval (IR) component that finds topic-relevant documents in a document set; an opinion classification component that identifies, among the results of the IR step, the documents that contain opinions; and a ranking component that orders the documents by their relevance to the query and by the degree to which they express opinions about it. We implemented the algorithm as a working system and tested it on the TREC 2006 Blog Track data in automatic title-only runs. Our results show 28% to 32% improvements in MAP score over the best automatic runs in the 2006 track. Our result is also 13% higher than that of a state-of-the-art opinion retrieval system tested on the same data set.
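The three-stage structure described above (topic retrieval, opinion filtering, combined ranking) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the system evaluated in this paper: the term-overlap relevance score, the small `OPINION_WORDS` lexicon standing in for an opinion classifier, the linear combination controlled by the hypothetical `weight` parameter, and the toy `DOCS` collection. In particular, the sketch scores opinionated language anywhere in a document and does not check that the opinion is about the query.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon; a stand-in for a trained opinion classifier.
OPINION_WORDS = {"love", "hate", "great", "terrible", "awful",
                 "best", "worst", "disappointing"}


@dataclass
class Scored:
    doc_id: str
    relevance: float
    opinion: float


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (t.strip(".,!?;:\"'()").lower() for t in text.split())
    return [t for t in tokens if t]


def relevance_score(query, text):
    """Toy IR score: fraction of query terms that occur in the document."""
    q_terms = set(tokenize(query))
    d_terms = set(tokenize(text))
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def opinion_score(text):
    """Toy opinion score: fraction of tokens found in the lexicon."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(t in OPINION_WORDS for t in tokens) / len(tokens)


def opinion_retrieve(query, docs, weight=0.5):
    """Return documents that are on topic and opinionated, best first."""
    results = []
    for doc_id, text in docs.items():
        rel = relevance_score(query, text)
        if rel == 0.0:
            continue  # stage 1: keep only topic-relevant documents
        op = opinion_score(text)
        if op == 0.0:
            continue  # stage 2: keep only documents with opinions
        results.append(Scored(doc_id, rel, op))
    # stage 3: rank by a weighted sum of the two scores (assumed formula)
    results.sort(
        key=lambda s: weight * s.relevance + (1 - weight) * s.opinion,
        reverse=True,
    )
    return results


# Toy collection, invented for this example.
DOCS = {
    "a": "The new phone camera is great and I love it",
    "b": "The phone camera has a 12 megapixel sensor",
    "c": "I hate rainy weather",
    "d": "This phone is the worst",
}

ranked = opinion_retrieve("phone camera", DOCS)
print([s.doc_id for s in ranked])
```

In this toy run, document `b` is on topic but carries no opinion words, and document `c` is opinionated but off topic, so both are dropped; this mirrors the two relevance criteria stated in the abstract.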
