Team COMMIT at TREC 2011
We describe the participation of Team COMMIT in two TREC 2011 tracks: the Microblog track and the Entity track.

In the Microblog track, we used a feature-based approach. Specifically, we pursued a precision-oriented, recency-aware retrieval approach for tweets. Among other things, we used various types of external data. In particular, we examined the potential of link retrieval on a corpus of crawled content pages, and we used semantic query expansion based on Wikipedia. We also deployed pre-filtering based on query-dependent and query-independent features. Our main finding is that cutting off the result list is difficult and crucial for good results. We also found that using external data helps with both recall and precision.

In the Entity track, we focused on the Entity List Completion (ELC) task. We experimented with a text-based and a link-based approach to retrieving entities from Linked Data (LD). Additionally, we experimented with selecting candidate entities from a web corpus. Our hypothesis is that entities occurring on pages with many of the example entities are more likely to be good candidates than entities that do not. Because evaluation results were not available at the time of writing, we have no preliminary conclusions yet.

The remainder of the paper consists of two largely independent sections, one for each of the tracks in which we participated, plus a conclusion.
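The candidate-selection hypothesis for the ELC task can be illustrated with a small sketch. This is not the system described in the paper: the function name `rank_candidates`, the representation of a page as a set of entity identifiers, and the scoring rule (summing, per candidate, the number of example entities on each page where it occurs) are all assumptions chosen only to make the idea concrete.

```python
from collections import Counter


def rank_candidates(pages, examples):
    """Rank non-example entities by co-occurrence with example entities.

    pages    -- iterable of collections of entity identifiers, one per web page
                (hypothetical input format, assumed for this sketch)
    examples -- collection of known example entities for the topic
    """
    examples = set(examples)
    scores = Counter()
    for entities in pages:
        entities = set(entities)
        # Number of example entities that appear on this page.
        overlap = len(entities & examples)
        if overlap == 0:
            continue  # pages without any example entity contribute nothing
        # Every other entity on the page is credited with the page's overlap.
        for entity in entities - examples:
            scores[entity] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    pages = [{"a", "b", "x"}, {"a", "x", "y"}, {"z"}]
    print(rank_candidates(pages, {"a", "b"}))
```

Under this assumed scoring, an entity that shares pages with many example entities rises to the top, while entities that only appear on pages without any example entity are never proposed.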