Fusion approach to finding opinionated blogs

In this paper, we describe a fusion approach to finding opinionated blog postings. Our approach to opinionated blog retrieval first applies traditional IR methods to retrieve on-topic blogs and then boosts the ranks of opinionated blogs based on combined opinion scores generated by multiple assessment methods. Our opinion module is composed of four components: the Opinion Term Module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently in opinionated blogs); the Rare Term Module, which uses uncommon/rare terms (e.g., “sooo good”) for opinion classification; the IU Module, which uses IU (I and you) collocations; and the Adjective-Verb Module, which uses the distributional similarity approach from computational linguistics to learn subjective language from training data.
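The two-stage idea above (retrieve on-topic posts, then boost opinionated ones using a combined opinion score) can be illustrated with a minimal sketch. The abstract does not state the fusion formula, so the weighted linear combination below is an assumption, as are the names `ScoredPost`, `combined_opinion_score`, `rerank`, the module keys, the weights, and the `alpha` parameter; scores are assumed to be normalized to [0, 1].

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScoredPost:
    """A retrieved blog post with its topical score and per-module opinion scores."""
    doc_id: str
    topic_score: float                # from the initial on-topic retrieval, assumed in [0, 1]
    opinion_scores: Dict[str, float]  # one score per opinion module, assumed in [0, 1]


def combined_opinion_score(opinion_scores: Dict[str, float],
                           weights: Dict[str, float]) -> float:
    """Weighted average of the module scores (hypothetical fusion formula)."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(w * opinion_scores.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total_weight


def rerank(posts: List[ScoredPost],
           weights: Dict[str, float],
           alpha: float = 0.5) -> List[ScoredPost]:
    """Reorder on-topic posts so that opinionated ones move up.

    alpha = 0 keeps the original topical ranking; alpha = 1 ranks by opinion only.
    """
    def fused(post: ScoredPost) -> float:
        opinion = combined_opinion_score(post.opinion_scores, weights)
        return (1 - alpha) * post.topic_score + alpha * opinion

    return sorted(posts, key=fused, reverse=True)


# Illustrative module keys and equal weights (assumed, not taken from the paper).
WEIGHTS = {"opinion_term": 1.0, "rare_term": 1.0, "iu": 1.0, "adjective_verb": 1.0}

posts = [
    ScoredPost("factual-post", 0.9, {name: 0.0 for name in WEIGHTS}),
    ScoredPost("opinionated-post", 0.8, {name: 1.0 for name in WEIGHTS}),
]
reranked = rerank(posts, WEIGHTS, alpha=0.5)
print([p.doc_id for p in reranked])
```

In this sketch the slightly less on-topic but strongly opinionated post overtakes the factual one after reranking. A linear mix is only one of several ways the module scores could be fused, and the weights would normally be tuned on training data.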
