Comparing semantic associations in sentences and paragraphs for opinion detection in blogs

Opinion detection is one of the most interesting and challenging tasks in the field of Information Retrieval. A considerable body of research already exists in this area, including some distinctive work. A review of the literature reveals that researchers have worked at different levels of granularity, such as documents, passages, sentences and words, for the task of opinion detection. In this work we revise our previous approach, which combines document-level heuristics with a method based on semantic similarity. We evaluate this semantic similarity approach on a large data collection using three different setups involving both sentences and passages, and then compare the performance of our approach across these setups. For evaluation, we use the TREC Blog 2006 collection (148 GB) with the 50 topics of TREC Blog 2006, over a baseline obtained through the Terrier Information System Platform. Results show that our approach improves the baseline opinion mean average precision (MAP) by 28.89%, 30.13% and 32.26% using setups one, two and three, respectively.
