IIT Kharagpur at TREC 2008 Blog Track
This paper describes our opinion retrieval system for the TREC 2008 Blog Track. We focused on five different aspects of the system. The first module extracts the blog content from junk HTML, thereby decreasing the noise in the indexed content. The second module removes various kinds of spam content from real blogs. The third module retrieves the relevant documents. The fourth module filters out opinionated documents, and the fifth calculates the polarity of the sentiments in the document. The final ranked retrieval runs were based on various combinations of settings in each module so as to study the effect