Paper Information - York University at TREC 2007: Genomics Track

Our Genomics experiments this year focus mainly on improving passage retrieval performance in the biomedical domain. We address this problem by constructing different indexes. In particular, we propose a method for building a word-based index and a sentence-based index. The passage mean average precision (passage MAP) of our first run, "york07ga1", which used the word-based index, was 0.095, and the passage MAP of our second run, "york07ga2", which used the sentence-based index, was 0.086. However, the passage MAP of our third run, "york07ga3", which used both the word-based index and UMLS for query expansion, dropped to 0.060. All three official runs are automatic. The evaluation results show that the word-based index is more effective than the sentence-based index for passage retrieval. We find that pseudo-relevance feedback makes a positive contribution to retrieval performance. However, we also find that query expansion using UMLS and Entrez Gene does not improve retrieval performance and in some cases degrades it.