How Complementary Are Different Information Retrieval Techniques? A Study in the Biomedical Domain

In this paper, we present an empirical study of the runs submitted to the TREC Genomics Track, a forum for information retrieval research in biomedicine. Based on the evaluation criteria provided by the track, we investigate how much relevant information is typically lost from a run, how well the relevant candidates are ranked with respect to their level of relevance, and how they are distributed among the irrelevant ones in a run. We also examine whether relevance or the level of relevance plays the more important role in performance evaluation. Answering these questions may give us insight into current IR technologies and help us improve them. The study reveals that recognizing relevance is more important than recognizing the level of relevance. It indicates that, on average, more than 60% of relevant information is lost from each run, measured either by the amount of relevant information or by the number of aspects (subtopics, novelty, or diversity), which suggests substantial room for performance improvement. The study also shows that the runs submitted by different groups are quite complementary, which implies that ensemble IR methods could significantly improve retrieval performance. Finally, the experiments illustrate that a run performs well or poorly mainly because of its top 10% of rankings; the rest of the run contributes only marginally to overall performance.